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1.  Introduction  and  Summary 


This  report  describes  the  research  activities  in  the  Information  Systems 

C  '• 

Laboratory  (ISL)  at  Stanford  Universty  on  Distributed  Sensor  Networks  (DSN) .  Our 
effort  is  part  of~  a~  PARPft* sponsor od  program-wlth  participation  _bv— Several 

V. 

contractors.  The  objectives  are  to  develop  new  and  innovative  signal  processing 
and  computer  network  techniques  with  applications  to  systems  employing 
multiple  sensors  for  target  surveillance  and  tracking.  Such  systems  would  be 
composed  of  sensors,  processors,  and  data  bases  distributed  throughout  an  area, 
interconnected  by  a  suitable  communication  system.  The  system  will  serve  a 
user-community  that  is  also  distributed  and  connected  to  the  same  communication 
system. 

A  basic  premise  of  our  approach  is  that  most  of  the  currently  available  signal 
processing  algorithms  are  not  well  suited  to  the  DSN  problem  because  of  their 
highly  centralized  structure.  The  straightforward  application  of  standard 
techniques  in  the  context  of  distributed  networks  leads  to  ad  hoc,  suboptimal 
designs.  We  .*el  that  it  is  essential  to  have  a  more  careful  look  at  the  basic 
requirements  for  developing  and  implementing  distributed  algorithms.  Our 
preliminary  conclusions  were  that  improved  system  performance  and  a  more 
natural  system  structure  will  result  if  innovative  signal  processing  modules  will 
be  used:  new  analytical  and  computational  techniques  have  to  be  developed  and 
used  in  the  DSN  design,  rather  than  trying  to  adapt  this  problem  to  standard 
solutions.  Accordingly,  the  major  emphasis  of  our  research  effort  to  date  has  been 
the  development  of  novel  signal  processing  algorithms  which  are  especially  suited 
to  the  DSN.  ^  .  .  _ _  _ _ 

The  Impact  of  these  optimal  distributed  algorithms  on  the  overall  network 
design  was  also  addressed.  This  effort  is  currently  in  the  developmental  stage,  but 
some  very  promising  preliminary  results  have  already  been  obtained. 


K.v 


DSN  System  Structure 


The  Distributed  Sensor  Network,  contains  various  classes  of  sensors  and  Is 
required  to  provide  different  categories  of  user  Information.  Therefore,  the  DSN 
system  must  have  a  collection  of  signal  processing  modules  which  can  be  used  in 
different  combinations  according  to  the  sensors  adopted  and  the  type  of 
information  requested. 

Another  feature  of  the  DSN  is  the  wide  range  of  data  rates  appearing  in  different 
system  components:  from  the  wideband  "raw"  sensor  data  to  the  relatively  low 
bandwidth  summary  information  required  by  the  user.  A  related  aspect  is  the 
communication  bandwidth  requirements  between  different  system  components.  A 
local  set  of  sensors  may  have  to  exchange  high-rate  data  to  perform  the  necessary 
signal  processing  (e.g.  cross  correlation);  sensors  from  separate  regions  may  only 
need  to  exchange  low-rate  target  parameters  (e.g.  bearing  and  range  estimates)  to 
generate  location/velocity  estimates.  Finally,  the  communication  between  the  user 
and  the  network  may  be  at  even  lower  rates. 

•  This  type  of  basic  consideration  has  led  us  to  a  hierarchical  system  structure, 
where  different  levels  have  different  signal  processing  modules. 

At  the  top  are  the  user  entry  points  and  inter-user  communication.  The 
interface  between  the  user  and  the  network  also  includes  more  complex  functions 
such  as  combining  and  synthesizing  data  sets  provided  from  different  regions. 

At  the  next  level  are  the  regional  processors  which  can  assemble  the  answers  to 
user  questions  required  on  a  local  level:  e.g.  combine  bearing  measurements  from 
several  data  sources  to  give  target  location  estimates. 


Parameter-list  processing  Is  the  next  lower  level  in  the  system.  Here,  some 
initial  target  parameters  are  processed  to  provide  higher  quality  information.  For 
example:  time-difference-of-arrival  (TDOA)  information,  is  translated  into 

range/bearing  estimates.  This  level  performs  relatively  complex  computations,  so 
as  to  provide  the  regional  processor  with  easily  usable  data.  Different 
parameter-list  processors  will  communicate  at  moderate  data  rates. 

It  should  be  noted  that  the  proper  choice  of  parameters  to  be  used  by  this 
processor  (both  input  and  output)  has  a  major  impact  on  system  performance  and 
communication/computation  cost.  An  important  part  of  our  work  was  devoted  to 
developing  and  Identifying  appropriate  parametrlzations  for  the  target  location 
problem. 

At  the  lowest  level  are  the  front-end  processors  which  operate  directly  on  the 
"raw"  sensor  data.  These  processors  provide  relatively  low  quality  target  data 
parameters  which  are  then  processed  by  the  parameter-list  processor.  Local  sensor 
sites  may  need  to  communicate  at  high  bandwidth  to  compute  these  parameter 
estimates.  The  sensor  together  with  its  front-end  processors  can  be  thought  of  as  a 
single  "smart  sensor"  unit. 


Signal  Processing  Modules 


The  ISL  project  group  has  been  responsible  for  a  number  of  novel  theoretical 
developments  in  the  area  of  algorithms  for  signal  processing,  estimation  - 
Identification  and  control.  We  have  applied  these  approaches  to  the  development  of 
several  signal  processing  modules,  which  will  provide  the  components  for  the  DSN 
system  described  earlier.  These  includes 

-  Real-Time  ARMA  and  Delay-Differential  System  Modeling. 

This  is  a  new  approach  which  provides  a  unifying  framework  to 
multi-target  bearing  and  spectral  signature  estimation. 

-  The  Linear  Equation  Approach  to  the  Target  Location  Problem. 

Our  approach  provides  efficient  algorithms  for  computing  target 
location  from  time-of-arrival  differences,  in  a  distributed  fashion. 

-  Distributed  Kalman  Filtering  Algorithms. 

Some  ideas  from  scattering  theory  were  used  to  derive  new  distributed 
versions  of  the  Kalman  filter  applicable  to  the  target  tracking  problem. 

-  Image  Reconstruction  Techniques. 

Algorithms  which  were  developed  to  reconstruct  images  from  projections 
(e.g.  for  computer  assisted  tomography,  in  medical  applications)  are 
used  to  estimate  target  location  from  lists  of  partial  target  information 
(e.g.  bearing  only  or  range  only). 

-  Event  Detection. 

An  innovative  approach  to  handling  inordinate  amounts  of  sensor  data,  this 
class  of  algorithms  identifies  significant  "events"  in  the  data  (e.g. 
appearance  of  a  new  target,  or  change  of  course  of  a  target  under  track), 
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and  limits  the  processing  effort  to  the  neighborhood  of  these  events. 

-  Two-  and  Three-Dimensional  Signal  Processing. 

This  is  a  mathematical  framework  which  treats  the  target  location  problem 
as  a  multi-dimensional  imaging  problem. 


The  rest  of  this  report  contains: 

(i)  A  more  detailed  description  of  our  overall  approach  to 
the  distributed  sensor  network  organization,  section  2: 

Distributed  Sensor  Network  Design. 

(ii)  An  overview  of  the  signal  processing  modules  which  have  been 
developed  or  are  currently  under  development.  These  modules 
are  the  building  blocks  of  our  DSN  system  design,  section  3: 

Signal  Processing  Modules  for  DSN. 

(iii)  A  collection  of  three  papers  presenting  the  details  of  some 
of  our  processing  modules: 

A  System  Identification  Approach  to  Locating  Spatially 

Distributed  Targets. 

Source  Location  from  Time  Difference  of  Arrival: 

A  Linear  Equation  Approach. 

Distributed  Algorithms  for  Estimation  and  Detection. 


/.  A  J.   , 
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2.  Distributed  Sensor  Network  Design 


The  detailed  design  of  a  sensor  network  depends  intimately  upon  the  choice  of 
sensors,  the  characteristics  of  the  input  signals,  and  the  specifications  of  the  output 
requirements.  Thus,  only  the  outline  of  a  system  will  be  proposed.  Several  classes 
of  sensors  will  be  considered,  and  alternative  algorithms  will  be  introduced,  to  be 
selected  later  depending  upon  the  characteristics  of  the  input  signal. 

An  important  caveat  is  that  this  is  our  current  best  guess  at  an  organization  for 
such  a  network,  based  upon  our  recent  work  in  multichannel  and  distributed 
algorithms.  Some  aspects  of  this  design  are  well  understood,  others  are  very  poorly 
understood.  For  example,  performance  bounds  of  individual  algorithms  can  be 
readily  obtained?  the  interactions  and  overall  performance  of  systems  of 
algorithms,  however,  are  much  harder  to  predict.  In  addition,  the  proposed 
partitioning  into  modules  may  be  obviated  by  future  work. 

Figure  1  presents  a  preliminary  system  organization.  User  output  consists  of 
target  location  and,  perforce,  target  detection.  [  Other  target  information,  such  as 
reflection  coefficients,  acoustic  spectra,  etc.,  will  also  be  available  as  secondary 
results.]  Input  derives  from  three  classes  of  sensors:  omni-directional, 
single-dimensional,  and  multi-dimensional.  Omni-directional  sensors  are  rich  in 
information  content  but  provide  only  indirect  location  data.  The 
single-dimensional  sensors  provide,  for  example,  range-only  or  bearing-only 
information.  This  information  must  then  be  integrated  into  the  data  from  the  third 
class  of  sensors,  which  provides  location  information  (e.g.,  range  and  bearing) 
directly.  The  integration  process  should  take  into  account  that  measurements  have 
varying  error  probability  distributions  or,  in  the  worst  case,  are  erroneous. 
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Figure  1.  System  Slock  Diagram 


The  processing  can  be  partitioned  into  four  stages:  front-end  sensor  processing, 
parameter-list  processing,  regional  processing  and  the  user  interface.  Front-end 
processing  transforms  raw  sensor  data  into  target  paramters  suitable  for  refinement 
at  later  stages.  For  example,  the  omni-directional  sensor  processors  could  provide, 
as  output,  range,  bearing,  or  time-of-arrival  (TOA)  information.  This  step  typically 
requires  tight,  high-bandwidth  coupling  between  sensor  processors  if,  for 
example,  cross-correlation  is  required. 

The  parameter-list  stage  accepts  lists  of  target-parameters  as  input  and  produces, 
as  output,  components  of  the  location.  This  processing  step  typically  requires 
information  from  a  local  neighborhood  of  sensors:  a  good  example  would  be  the 
synthesis  of  target  location  from  time-of-arrival  input. 

At  the  third  stage,  information  from  regional  groups  of  sensors  is  combined.  The 
regions  may  be  large,  and  the  information  is  typically  imperfect  and  redundant. 
The  final  stage  then  integrates  all  information  into  an  output  desired  by  the  user 
community. 


Front-End  Processing 

Much  of  the  front-end  processing  is  best  described  in  the  context  of  developing 
an  algorithm,  such  as  the  Extended  Kalman  Filter,  which  cuts  across  many  stages  of 
v  ssing.  As  shown  in  figure  2,  however,  the  front-end  processing  required  for 
om.  iirectional  sensors  can  be  considered  as  an  integral  unit. 

\  ie  frequency-wavenumber  block  --  which  produces,  as  output  location 
inf oi  mation  --  and  the  long-baseline  cross-correlation  (without  pre-whitening)  — 
which  produces  time-difference-of -arrival  (TDOA)  data  --  were  presented  by 
Lincoln  Laboratories  in  their  initial  strawman  proposal  (Lincoln  [78]). 
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The  ARMA  modeling  block  (see  Section  IV  and  Appendix  B  of  [FM])  Is  a 
refinement  and  unification  of  these  approaches.  Using  high-resolution,  spectral 
estimation  (maximum  likelihood  or  maximum  entropy  techniques)  would  refine 
the  data  provided  by  the  strawman  algorithms.  More  precise  bearing  and  TDOA  data 
will  be  available,  as  well  as  a  host  of  other  auxiliary  parameters. 

The  fourth  block  (see  also  figure  3),  using  ladder  forms,  begins  with  a  data 

* 

compression  stage.  Each  sensor's  data  is  processed  independently  to  detect 
discernable  changes  in  the  data  (events),  compression  is  a  two  step  procedure:  first, 
the  data  passes  through  ladder-form  filters  which  identify  and  locate  changes, 
producing  a  log-likelihood  function;  second,  an  event  detector  scans  this  function, 
looking  for  significant  events  and  thereby  generating  TOA  data.  If  the  events  are 
precise  enough  this  data  can  be  combined  trivially  to  get  TDOA  information; 
alternatively,  the  raw-data  near  the  event  can  be  exchanged  among  sensors  and  a 
multi-channel  ladder-form  filter  can  be  used  to  generate  a  Joint  log-likelihood 
function,  and  TDOA  information  thereby. 

[  The  existence  of  "events"  or  similar  signal  characteristics  is  largely  an 
unexplored  question  at  this  moment.  Preliminary  analysis  of  various  signal  signal 
sources  has  shown  that  speech  and  certain  other  acoustic  signals  display  "events" 
(in  speech  these  could  be  either  plosive  sounds  or  voiced  speech).  The  recent 
availability  of  our  efficient  recursive  ladder-form  algorithms  enabled  the 
"discovery  process"  of  such  events.  ] 

These  four  algorithms,  which  convert  omni-directional  data  into  components  of 
a  location  vector,  can  be  compared  along  several  dimensions:  the  expected 
characteristics  of  the  input  signal,  the  accuracy  required  in  the  output,  the 
bandwidth  required  for  inter-sensor  communication,  and  the  algorithm's 
amenability  to  distribution. 


TOA  DIFFERENCES 
OR 

MULTICHANNEL 


LADDER  FORM 


The  frequency-wavenumber  processing  of  local  arrays  of  sensors,  as  proposed  by 
Lincoln  Laboratories,  is  a  good  example  of  an  independently  distributed  algorithm 
that  requires  minimal  communication  between  sensor  sites.  The  FFT  approach, 
however,  has  several  drawbacks,  since  information  is  thrown  away 
(cross-correlation  data)  and  since  the  resolution  is  limited  because  of  the  effect  of 
windowing  and  finite  record  length. 

Long-baseline  cross-correlation,  in  contrast,  distributes  very  poorly,  and 
requires  very  high  inter-sensor  communication  bandwidth.  Also,  for  the  Lincoln 
Laboratory  study,  the  algorithm  produced  poor  results.  Pre-whitening  the  data  on 
input  may  improve  the  output  this  block,  but  long-baseline  inteferometry 
processing  will  still  require  high-bandwidth  inter-sensor  exchange  of  the  raw 
data. 

The  ARMA  modeling  is  an  intermediate  solution,  which  replaces  both  the  FFT 
and  the  cross-correlation  processes;  as  would  be  expected,  this  approach  also 
typically  requires  high-bandwith  data  exchange.  However,  these  algorithms 
promise  to  yield  superior  results;  assumptions  concerning  periodicity  and  the  noise 
spectrum  have  not  been  made,  and  time-varying  sources  can  be  tracked.  In 
addition,  and  in  contrast  to  FFT  and  cross-correlation  algorithms,  ARMA  modeling 
produces  a  minimal  parameterization  (parsimonious  output),  introducing  well 
behaved  data  reduction  at  this  early  stage  of  analysis. 

The  communication  drawback  associated  with  ARMA  modeling  can  be  partially 
alleviated  if  we  can  use  Matrix  Fraction  Description  Models  (MFD)  in  the  ARMA 
block  (see  section  4).  Then,  the  number  of  variables  communicated  is  neither  a 
function  of  the  number  of  sensors  nor  of  the  amount  of  raw  sensor  data,  but  rather 
is  a  function  of  the  needs  of  the  user  (i.e.,  of  later  stages  of  processing). 


The  fourth  block  distributes  well  and  promises  good  communication 


characteristics  (as  can  be  seen  from  figure  3)i  the  compression  process  implies  only 
exchanging  data  about  significant  events,  and  possibly  the  raw  data  immediately 
surrounding  those  events.  The  first  two  stages  are  independently  distributed,  and 
require  no  inter-process  communication.  The  output  is  raw  sensor  data,  but  the 
event  detector  only  produces  packets  immediately  surrounding  statistically 
significant  events.  The  final  step,  the  determination  of  TDOA  information,  requires 
inter-process  communication,  but  if  the  quality  of  events  is  very  high,  only  the 
time  of  the  events  needs  to  be  communicated;  in  the  worst  case,  the  packets  of  raw 
data  would  be  exchanged,  which  would  still  have  a  comparatively  low-bandwidth. 
Thus,  if  these  algorithms  are  applicable  to  specific  instantiations  of  the  DSN 
concept,  then  we  will  have  the  high-resolution  of  an  ARMA  model  with  a  very 
high  degree  of  distribution  and  comparably  low  communication  bandwidth 
requirements. 

In  summary,  turning  again  to  figure  2,  we  see  that  raw  sensor  data  provides 
input  to  the  front-end  processors.  As  output,  we  receive  parameter-list 
information,  often  redundantly:  two  algorithms  provide  bearing  information, 
three  provide  TDOA  data,  and  several  provide  target  paramters.  This  data,  with  its 
associated  error  statistics,  will  be  integrated  in  subsequent  processing  stages. 


Parameter-List  Processing 

Given  time-difference-of-arrival  Information  as  input  information,  the 
traditional  algorithm  for  determining  location  Involves  combining  results  from  at 
least  three  sensors  to  determine  the  intersection  of  a  set  of  non-linear  curves  (see 
figure  4).  The  computational  burden  of  such  an  approach  is  extremely  high.  With 
a  sufficient  number  of  sensors,  however,  it  is  possible  to  determine  the  target 
location  using  linear  equations.  This  significantly  reduces  the  computational 


burden,  simplifies  the  problem  of  distributing  the  algorithm,  and  provides  for 
graceful  degradation  as  sensors  fail.  The  largely  open  questions  concern  the 
behavior  of  these  methods  in  the  presence  of  noise,  finding  effective  organizations 
for  distributing  these  algorithms  while  retaining  robustness,  and  determining  the 
communication  bandwidth  requirements.  Our  approach  has  the  potential  of 
producing  algorithms  that  have  very  favorable  answers  to  these  questions,  (see 
Section  6.) 

Extended  Kalman  filters  are  another  class  of  parameter-list  processing 
algorithms}  a  basically  non-linear  problem  is  continually  re-linearized,  with  all 
unknown  parameters  included  in  the  state.  Although  these  methods  are  very 
popular  in  current  centralized  approaches,  cooperatively  distributed  forms  with 
reasonable  communication  bandwidths  have  been  difficult  to  obtain.  At  the 
moment  we  can  only  suggest  independently  distributed  approaches  that  are  based 
on  efficient  exchanges  of  data,  coupled  perhaps  with  "event"  detectors. 

The  difficulties  with  this  approach  arise  because  the  linearization  of  the 
underlying  nonlinear  model  and  measurement  equations  obscures  the  structure  of 
the  generally  well  understood  underlying  physical  processes.  'Furthermore,  the 
nonlinearities  make  a  performance  analysis  of  these  distributed  filters  very  hard  to 
obtain. 
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Figure  4.  DSN  System  Structure 


Regional  Processing 


Given  information  such  as  bearing-only  or  range-only  data,  back-projection  is  a 
well  known  technique  for  integrating  this  partial  data  into  a  single  framework 
that  will  localize  the  target  position.  Intrinsically,  this  stage  of  data  integration 
requires  that  each  processing  site  have  data  from  a  large  number  of  adjacent  sites, 
i.e.  those  other  sites  whose  sensors  overlap  the  site  in  question.  At  this  level, 
however,  the  communication  requirements  are  typically  reduced  compared  to 
earlier  stages,  although  the  computational  burden  may  be  very  large. 

The  important  enhancement  we  propose  for  this  stage  introduces  error 
assessments  into  the  back-projection  process.  With  most  sensors,  some 
measurements  have  a  greater  confidence  associated  with  them  than  others;  this 
information  should  be  incorporated,  especially  when  using  disparate  sensor  types 
or  redundant  measurements.  The  thesis  of  S.  Wood  [Wood]  has  analyzed  the 
applicability  and  performance  of  state  estimators  to  this  problem.  She  demonstrated 
that  significant  performance  improvements  can  be  achieved  over  naive  techniques, 
especially  for  ill-determined,  constrained,  or  high  signal-to-noise-ratio  data. 


Summary 

An  outline  of  possible  component  blocks  for  a  distributed  sensor  network  with 
cooperatively  distributed  processes  has  been  shown  in  figure  4.  Various 
constituent  modules  have  been  proposed;  the  choice  of  which  blocks  to  implement 
in  a  given  system  depends  upon  the  sensors  available  and  the  characteristics  of  the 
data.  Several  modules  redundantly  compute,  for  example,  bearing  or  TOA 
information;  only  from  an  analysis  of  specific  applications  can  we  select  the 
optimal  configuration. 
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3.  Signal  Processing  Modules  for  DSN 


This  section  presents  a  brief  summary  of  the  types  of  algorithms  we  have 
investigated,  the  results  obtained,  and  the  various  questions  that  still  need  to  be 
answered.  The  approaches  we  have  used  can  be  classified  into  the  following 
categories: 

-  Real  Time  Source  Modeling  (System  Identification, 

-  Constant  Sotirce  Parameter  Models 

-  Matrix  Fraction  Description  Models 

-  Ladder  Forms  for  "Event"  Detection, 

-  Multisite  Lateration  (MSL) 

-  Distributed  State  Estimation  and  Hypothesis  Testing  (DSE) 


-  Distributed  Information  Filters  (DIF) 

-  Image  Reconstruction  Techniques  (IRT) 

-  Two  and  Three  Dimensional  Signal  Processing  (2&3-D  Systems) 

-  Partial  Differential  Equation  Models  (PDE) 

-  Delay  Differential  Equation  Models  (DDE) 


ARMA) 

(CSPM) 

(MFD) 

(LLL) 


3.1.  Real  Time  Source  Modeling 

We  are  investigating  a  new  approach  to  target  bearing  (or  time-of-arrival) 
estimation  by  using  system  identification  techniques.  In  this  approach  the 
parameters  of  an  auto-regressive  moving  average  (ARMA)  model  are  chosen  for  best 
fit  to  the  observed  time  series.  If  the  source  is  purely  autoregressive  it  can  be 
shown  that  the  moving  average  part  of  the  model  contains  direct  information  about 


the  TDOA,  while  the  auto-regressive  part  represents  the  dynamics  of  the  source 
process.  Thus,  identifying  a  system  model  provides  information  about  the  source 
bearing  and  about  its  spectrum. 

Following  this  observation,  we  developed  a  new  estimation  scheme  for  TDOA 
which  has  a  number  of  potential  advantages  over  the  more  conventional  estimator 
structures! 

-  No  prior  knowledge  of  the  noise  spectrum  is  required, 
since  the  proposed  scheme  automatically  performs  spectral 
estimation. 

-  By  adaptively  varying  the  ARMA  coefficients  to  minimize  an  error 
criterion  this  estimator  can  handle  nonstationary  source  processes 
(time- varying  statistics)  or  moving  sources  (time-varying  TDOA) 

-  Efficient  recursive  algorithms  are  available  for  computing 
ARMA  coefficients  for  vector  time  series. 

In  summary,  we  applied  system  identification  techniques  to  obtain  a 
computationally  efficient  solution  to  the  TDOA  estimation  problem,  using  our  "fast 
algorithms".  Other  techniques  for  identifying  the  ARMA  coefficients  can  be 
applied:  repeated  least  squares,  generalized  least  squares,  maximum  likelihood, 
instrumental  variables,  etc. 

1 

A  major  challenge  of  this  approach  is  to  obtain  distributed  forms  of  these 
algorithms,  because  of  the  inherent  high  complexity  and  tight  coupling  within 
these  algorithms.  One  of  the  more  promising  methods  we  are  exploring  is  based  on 
information  filter  forms  and  "Matrix  Fraction  Descriptions"  (MFD’s),  a  dual 
representation  of  ARMA  models.  These  forms  can  take  better  advantage  of  the 
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inherent  physical  structure  of  the  source  location  and  characterization  problem, 
e.g.  the  modeling  of  the  "cascade"  of  source,  propagation  medium  and  receivers. 
This  combination  can  actually  more  accurately  be  described  by  delay-differential 
equations,  also  a  topic  under  investigation. 

One  potentially  very  important  feature  of  these  alternative  forms  is  that  their 
required  communication  bandwidths  will  not  be  a  function  of  the  number  of 
sensors  Involved  in  a  DSN  but  rather  the  number  of  sources.  This  is  due  to  the  fact 
that  these  models  can  separate  the  source  and  propagation  models  effectively.  An 
other  feature  is  the  possible  trade-offs,  such  as  storage  versus  communication.  For 
example,  if  in  the  source  spectral  information  can  be  made  available  to  the  sensors 
and  estimates  are  required  only  relatively  infrequently,  the  number  of  variables 
communicated  in  the  network,  is  again  neither  a  function  of  the  number  of  sensors 
nor  of  the  amount  of  data  gathered  at  the  sensors.  In  other  words  the  sensor  can 
effectively  concentrate  the  received  data,  only  communicating  data  that  is  of 
interest  to  the  user.  Section  6  describes  some  of  the  forms  of  distributed  algorithms 
required  for  this  application.  The  distributed  algorithm  for  this  case  would  in 
addition  take  advantage  of  Matrix  Fraction  Descriptions,  that  can  effectively  be 
used  to  provide  beam-forming  and  subsequent  Optimal  filtering. 

A  detailed  description  of  our  results  in  this  area  can  be  found  in  section  4  of  this 
report:  "A  System  Identification  Approach  to  Locating  Spatially  Distributed 
Targets." 


Ladder  Forms  for  Log-Likelihood  Detection 

In  a  separate  report  on  "Fast  Parameter  Tracking  via  Ladder  Forms"  [ML]  we  are 
reporting  on  the  development  and  applications  of  ladder  forms  for  on-line 
parameter  estimation  and  tracking  for  single  and  multi-channel  data.  One  of  the 


major  "discoveries"  made  in  this  report  is  the  appearance  of  certain  likelihood 
variables  as  optimal  stepsizes  in  these  recursive  algorithms.  Their  extremely 
interesting  behavior  to  jump  components  or  "outliers"  in  the  data  create  an  interest 
in  these  variables  in  their  own  right,  as  well  as  in  the  many  potential  applications. 
An  innovative  approach  to  handling  inordinate  amounts  of  sensor  data,  these 
algorithms  can  identify  significant  "events"  in  the  data  (e.g.  any  discontinuity  in  a 
source,  such  as  a  change  of  course  of  a  target  under  track).  These  discontinuities 
effectively  provide  a  "sample-and-hold"  capability  to  a  parameter  tracking 
algorithm,  hence  limiting  the  necessary  processing  effort  to  the  neighborhood  of 
these  events,  and  providing  a  "time  marking"  capability  of  signals  (useful  for  TOA 
estimation).  In  addition  ladder  forms  are  also  very  useful  in  data  compression  in 
order  to  reduce  storage  and  communication  requirements.  Some  of  these 
applications  were  explored  in  the  context  of  speech  and  other  signals,  see  also 
[ML],  An  example  of  processed  acoustic  data  from  Lincoln  Labs  (Lincoln  [78])  is 
given  in  figure  5,  where  the  "original"  seismic  signal  during  the  closest  approach 
of  a  fly-by,  the  i-th  reflection  coefficient  K{i)  and  the  likelihood  variable  gamma(i ) 
is  displayed.  We  note  that  even  though  the  data  was  highly  processed,  the 
likelihood  variable  clearly  shows  distinct  impulse  like  components  or  "events". 
Also  note  the  change  in  the  signum  of  K(l),  a  sign  for  the  change  in  doppler 
frequency  from  positive  to  negative.  These  examples  are  clearly  preliminary 
results,  but  they  show  the  promise  of  our  approach. 
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Data:  A7-jet;  Receiver:  geophone,  SAV; 
Samples:  1926-2025;  Time:  6/27/77  5:36:30.00 


Figure  5a. 
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Data:  A7-jet;  Receiver:  geophone,  SAV; 
Samples:  1926-2025;  Time:  6/27/77  5:36:30.00; 


Figure  5b. 


3.2.  Multi-Site  Lateration  (MSL) 


Progress  has  been  made  on  a  solution  to  the  multisite  lateration  problem  using 
only  time-of-arrival  measurements  which  require  only  solutions  to  linear 
equations.  This  possibility  was  announced  at  an  earlier  site  visit,  and  has  now  been 
confirmed  (see  also  [Schmidt]).  The  significance  of  demonstrating  the  linearity  of 
these  equations  arises  because  linearity  is  a  sufficient  condition  for  a  distributed 
form  of  the  algorithm.  Any  subset  of  sensors  can  optimally  combine  estimates  of 
source/sensor  coordinates,  and  these  estimates  can  be  combined  globally  using  a 
variety  of  hierarchical  organizations.  The  advantage  of  the  linear  equations  is  their 
inherent  fail-soft  property  from  N  down  to  4  stations,  i.e.,  we  have  an  "automatic 
re-configuration",  since  delayed  or  even  missing  measurement  beyond  at  least  4 
independent  measurements  have  no  catastrophic  failure  effects.  The  additional  or 
missing  measurements  only  raise  or  lower  the  confidence  in  the  estimates  to  some 
degree.  (Even  the  4  station  (linear)  solution  has  at  least  a  single  non-linear 
fail-soft  property,  i.e.  nonlinear  solutions  can  be  used  If  one  link  fails.)  DSN 
systems  based  on  "minimal"  but  nonlinear  3  independent  station  solutions  however 
can  not  fail-soft1.  A  single  measurement  or  link  failure  can  give  catastrophic 
degradations  i.e.  a  breakdown  of  the  whole  system  in  the  worst  case,  since  no 
redundancy  would  be  present,  a  case  that  should  definitely  be  avoided! 

It  is  perhaps  surprising,  but  not  hard  to  prove  by  verification,  that  a  linear  set 
of  equations  can  be  given  for  the  Euclidean  coordinates  of  a  source  given  range  or 
range-difference  measurements  (or  equivalently  TOA  or  TDOA  measurements.)  The 
fact  that  the  multi-site  lateration  problem  (a  set  of  quadratic  equations)  can  be 
reformulated  as  a  problem  of  solving  linear  equations,  is  reminiscent  of  a  similar 
situation  in  least-squares  estimation,  where  solutions  to  quadratic  equations 
(quadratic  cost)  can  be  obtained  solving  linear  equations  (using  orthogonal 
transformations).  The  multi-site  lateration  problem  has  the  flavor  of  an  Inverse 
least-squares  problem.  A  detailed  description  of  one  of  our  approaches  to  this 


problem  can  be  found  In  section  5  of  this  report  "  Source  Location  from  Time 
Difference  of  Arrival:  A  Linear  Equation  Approach." 

3,3.  Distributed  State  Estimation  and  Hypothesis  Testing 

Since  many  of  the  estimation  schemes  are  based  on  state  estimation  formulation 
(e.g.  the  Kalman  Filter),  it  is  important  to  investigate  distributed  versions  of  state 
estimators.  A  simple  form  of  a  distributed  information  filter  was  presented  in  the 
appendix  of  our  first  DSN  report.  Various  attempts  have  been  made  in  the  past  to 
distribute  the  Kalman  Filter.  However,  its  inherent  complexity  and  the  high 
interaction  between  variables  makes  this  virtually  impossible,  in  the  sense  that 
either  a  lot  of  communication  will  be  necessary  (in  a  cooperatively  distributed 
approach)  or  many  variables  have  to  be  recomputed  locally  (in  an  independently 
distributed  approach).  The  information  filter  form  appears  at  the  moment  to  be 
better  adapted  to  distributed  computations  than  the  covariance  (or  Kalman)  filter 
equations.  In  some  cases,  such  a  brute  force  implementation  or  distribution  might 
be  unavoidable.  However,  there  are  interesting  cases  where  efficient  distribution 
of  the  computations  can  be  achieved  by  taking  advantage  of  the  structure  of  the 
problem.  Triangular  coupling  of  subsystems  and  degeneracies  or  non-existence  of 
driving  noises  leads  to  efficiently  distributed  algorithms.  Fixed  sources  or  moving 
targets  with  piece-wise  linear  tracks  are  all  examples  of  CSPM's. 

An  alternative  solution  is  possible  if  global  estimates  rarely  have  to  be 
computed,  or  local  estimates  have  to  be  "synchronized"  at  a  slower  rate  than  the 
data  that  is  gathered  at  the  sensors.  If  these  synchronization  time  points  and  the 
organization  of  sensors  for  obtaining  a  global  estimate  are  agreed  upon  by  the 
sensor  sites  by  the  time  of  the  start  of  a  new  period,  we  can  use  our  parallel 
processing  algorithms,  (see  [Morf,  Dobbins,  Friedlander,  Kailath],  [Dobbins])  to 
develop  essentially  independently  distributed  algorithms  that  are  communication 
bandwidth  constrained.  We  can  either  fix  a  maximal  deviation  in  the 


error-covariance  of  all  estimates  and  determine  the  necessary  synchronization 
intervals,  or  we  can  determine  these  intervals  by  fixing  the  communication 
bandwidth  and  thereby  limiting  the  number  of  information  exchanges  necessary  to 
achieve  the  global  synchronization.  The  equations  can  be  derived  relatively  easily 
using  our  scattering  theory  approach  to  least-squares  estimation,  see  e.g. 
[Friedlander].  A  detailed  description  can  be  found  in  section  6  of  this  report: 
"Distributed  Algorithms  for  Estimation  and  Detection." 
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3.4.  Two  and  Three  Dimensional  Signal  Processing 

Our  work  on  two  and  three  dimensional  signal  processing  is  characterized  by 
two  efforts. 

a)  A  basic  mathematical  approach  has  been  developed  that  is  aimed  at  providing  the 
proper  mathematical  tools  such  as  state  space  models  and  transfer  function  or 
frequency  domain  representations.  We  have,  in  the  meantime,  obtained  very  basic 
results  that  may  have  a  great  potential  impact  on  the  problem  of  obtaining  truly 
distributed  solutions  to  two  and  three  dimensional  signal  processing,  see  [Levy], 
[Levy.Morf].  However,  their  real  significance  is  still  hard  to  evaluate,  partially 
because  one  of  the  basic  results  we  are  hoping  to  obtain  is  still  missing:  a 
demonstration  of  the  existence  of  innovations  representations  for  two  and  three 
dimensions. 

One  of  our  basic  results  shows  that  some  of  the  structure  of  two  dimensional 
problems  are  definitely  different  from  three  dimensional  equivalents.  For 
example,  the  idea  of  implementing  filters  in  a  cascade  form,  very  basic  In  one 
dimension,  generalizes  (even  for  multiple  channels)  to  two  dimensions;  however, 
this  is  not  true  in  three  dimensions.  This  result  does  not  imply  that  solutions  do 
not  exist  in  three  dimensions,  it  only  points  out  that  solutions  in  three  dimensions 
can  look  quite  different  from  the  lower  (one  and  two)  dimension  equivalents. 

b)  As  discussed  in  the  Tutorial  Survey  presented  at  CMU,  one  method  for  estimating 
source  probability  distributions  is  based  on  image  reconstruction  techniques,  (IRT). 
The  comparatively  simple  back- projection  technique,  (see  e.g.  [Lincoln  Lab,  CMU 
workshop],)  does  not,  however,  lead  to  the  best  estimate  of  the  source  distribution, 
but  rather  to  what  is  known  in  the  estimation  literature  as  a  special  case  of  the 
information  state,  see  e.g.  [Wood],  More  precisely,  the  Information  state  is  the 
product  of  the  best  estimate  times  the  inverse  of  the  error-covariance:  a  "smeared 


out"  version  of  the  best  estimate.  The  best  estimate  can  be  obtained  either  by 
carrying  along  the  covariance  matrix  (recursively)  and  multiplying  the 
information  state  by  the  inverse,  or  by  recursively  computing  the  best  estimate 
directly.  We  showed  that  the  direct  approach  is  equivalent  in  the  continuous  limit 
to  the  so-called  "convolve  and  back  project"  solution,  whereas  the  information 
filter  approach  is  a  "first  back-project  then  convolve"  solution.  The  latter 
currently  appears  to  be  more  adapted  to  distributed  computations;  however,  one  of 
our  earlier  fundamental  results  showed  that  both  approaches  are  linked  via  an 
inversion  duality,  hence  ultimately  we  might  also  be  able  to  find  efficient 
distributed  algorithms  for  the  direct  form.  Both  forms  generally  involve  carrying 
along  a  form  of  the  covariance  matrix  (a  matrix  with  n2/ 2  entries  in  general,  where 
n  is  the  number  of  variables  to  be  estimated).  The  memory  requirements  for  a 
brute-force  implementation  of  these  methods  could,  therefore,  be  prohibitive.  One 
advantage,  however,  is  that  no  limit  on  the  number  of  targets  is  required,  since  a 
spatial  discretization  is  used.  This  implies  that  targets  are  lumped  together  if  they 
are  within  a  single  spatial  "bin".  Other  advantages  of  this  approach  derive  from  the 
linearity  of  the  optimal  estimator  (for  Gaussian  distributions),  from  flexible 
distributing  computations,  robustness  of  the  source  distribution  estimates,  and 
"automatic  reconfiguration"  after  sensor  losses.  We  are  still  studying  the 
implications  of  this  approach  on  computational  and  communication  requirements, 
as  well  as  other  system  parameters.  Our  recent  results  in  2-D  systems  [Levy.Morf] 
are  expected  to  play  an  important  role  in  this  context,  in  particular  in  displaying 
and  exploiting  sparseness  of  matrices  for  computational  and  other  benefits. 

3.4.1  Delay  Differential  Equation  Models 
Since  sources  emit  signals  that  generally  have  a  line  spectrum  and  that 
propagate  (i.e.  are  delayed)  through  a  medium,  we  can  conclude  that  natural 
mathematical  models  are  systems  of  delay-differential  systems.  B.  Levy  was  able  to 
show  in  his  thesis  that  in  the  generic  case  at  least  two  sensors  are  required  in  order 
to  be  able  to  observe  (or  two  inputs  in  order  to  control)  a  delay-differential  system. 


This  implies  that  a  (generally  moving)  source  can  be  located  from  appropriate 
measurements  of  only  two  sensors,  e.g.  time  and  frequency  of  arrival  type 
measurements  of  two  sensors  can  be  sufficient  for  estimating  the  trajectory  of  a 
source!  This  is  a  very  fundamental  mathematical  result;  its  full  Impact  on  the  DSN 
problem  remains  to  be  determined. 

We  may  note  that  our  other  approaches  that  are  based  on  ARMA  and  MFD  models 
essentially  make  use  of  discretized  delay-differential  equations,  hence  their 
structure  is  inherited  by  the  discretized  equivalents. 
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Abstract 

A  new  framework  is  presented  for  treating  the  problem  of  estimating  the 
location  of  a  multiple  of  point  sources  from  multi-sensor  measurements. 
Estimating  the  source  location  is  formulated  as  a  system  identification  problem  for 
the  system  consisting  of  the  source  and  propagation  models.  It  is  shown  how  the 
time  difference  of  arrival  (TDOA)  can  be  found  from  the  system  parameters. 
Features  of  the  proposed  approach  include:  simultaneous  estimation  of  multiple 
sources,  capability  of  handling  multipath  propagation  and  nonstationarlty  of  the 
source  and  noise  processes,  and  simultaneous  estimation  of  the  source  location  and 
its  spectral  "signature".  The  basic  concepts  are  described  in  detail  and  some  sample 
algorithms  are  presented.  A  number  of  interesting  connections  between  the  source 
location  problem  and  system  theoretic  issues  are  presented. 
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I.  Introduction 


In  many  practical  problems  it  is  necessary  to  determine  the  location  of  signal 
(and  noise)  sources  from  measurements  provided  by  one  or  more  sensors.  Typical 
applications  include: 

-  Acoustic  surveillance  systems  (e.g.,  detection  of  low  flying  aircraft); 

-  Seismic  arrays  for  seismic  exploration,  monitoring  earthquakes  and 
nuclear  explosions,  or  detecting  vehicle  movements; 

-  Antenna  arrays  for  radio  astronomy  or  electronic  surveillance 
(e.g.,  direction  finding); 

-  Multiple  radar  systems  for  detection  and  tracking. 

In  the  simplest  case  the  sensor  outputs  consist  of  amplitude  scaled  and  delayed 
replicas  of  the  waveform  from  the  distant  source,  corrupted  by  additive  noise.  One 
of  the  most  widely  used  methods  of  estimating  the  location  of  the  source  is  by 
finding  the  time-difference-of-arrival  (TDOA)  of  the  propagating  signal  to  the 
different  sensors.  It  has  been  shown  [1-2]  that  all  the  information  about  the 
source  location  (range  and  bearing)  is  contained  in  these  time-of-arrival 
differences  (TDOA's),  or  equivalently,  in  the  difference  of  the  sensor/source  range 
for  a  pair  of  sensors.  In  other  words,  estimating  the  source  location  is  equivalent  in 
some  sense  to  estimating  TDOA's  for  the  various  sensors. 

Methods  for  estimating  TDOA,  such  as  maximum  likelihood  estimators,  were 
developed  by  several  authors  [3-4].  These  estimators  are  usually  implemented  by 
using  one  of  the  following  structures: 


(i)  A  beamformer  (with  appropriate  pre-filtering)  followed  by  a  square-law 
device  and  an  integrator. 

(ii)  Pre-filtering  followed  by  a  multiplier  correlator. 

The  computations  are  often  performed  using  Fourier  transforms,  but  time- 
domain  implementations  are  also  quite  common.  Various  suboptimal  TDOA 
estimators  have  also  been  derived  [4],  The  TDOA  estimator  is  by  no  means  the  only 
method  for  estimating  source  location,  but  it  provides  a  convenient  prototype  for 
our  discussion,  for  details  about  other  approaches,  see  [5-8]. 

These  estimators  have  been  successfully  used  on  a  variety  of  applications.  There 
are,  however,  certain  aspects  of  the  currently  used  techniques  which  are 
somewhat  unsatisfactory.  For  example* 

-  Most  of  the  estimators  are  designed  to  be  optimal  for  the  single  source 
case.  When  multiple  sources  are  present  they  are  considered  one  at  a  time 
(while  other  sources  are  thought  of  as  "interference").  In  general  the 
treatment  of  multiple  source  proceeded  in  an  'ad  hoc'  fashion  and  there 

is  a  need  for  a  more  systematic  approach. 

-  The  performance  of  most  estimators  is  degraded  in  the  presence  of  multipath 
propagation. 

-  These  approaches  are  highly  centralized  and  do  not  seem  to  be  easily 
adaptable  to  distributed  processing.  The  current  trend  toward  using  multiple 
micro-  and  mini-computers  rather  than  a  single  powerful  central  computer 
makes  it  very  advisable  to  develop  algorithms  suitable  for  distributed 
processing.  In  addition  to  cost  effectiveness,  distributed  systems  have 


other  potential  advantages  like  speed  and  high  reliability  (survivability). 

These  and  other  consideration  provide  motivation  for  searching  alternative 
approaches  to  the  source  location  problem. 

In  this  paper  we  introduce  a  new  framework  for  estimating  TDOA's  or  source 
location.  The  objective  of  this  paper  is  to  outline  our  approach  and  indicate  some 
interesting  connections  between  the  source  location  problem  and  system  theoretic 
issues.  We  limited  ourselves  here  to  a  general  discussion  of  the  issues  and  deferred 
many  of  the  details  to  later  papers. 

The  basic  idea  is  to  use  system  identification  techniques  to  estimate  the 
parameters  of  the  system  comprised  of  the  source  (or  sources)  and  the  sensors.  The 
TDOA  estimates  can  then  be  easily  found  (even  for  multiple  sources)  from  these 
system  parameters  as  will  be  shown  in  the  next  section. 

The  proposed  approach  leads  to  new  estimator  structures,  quite  different  from 
the  classical  beamformer  or  cross-correlator.  This  new  estimator  has  a  number  of 
desirable  features  including: 

(i)  Estimating  multiple-source  locations. 

The  conventional  TDOA  estimators  treat  a  single  source  at  a  time  and  treat  any 
other  sources  as  noise  or  interference.  In  Section  III  we  will  present  an  optimal 
estimator  for  multiple-sources. 

(ii)  Simultaneous  estimation  of  source  location  and  its  spectrum. 


In  many  applications  (e.g.  passive  sonar)  one  wishes  to  estimate  the  source 
spectrum  for  recognizing  or  classifying  the  source  type  (i.e.  finding  its  spectral 
"signature")  or  for  other  purposes.  The  process  of  spectral  estimation  is  usually 
carried  out  separately  from  the  source  location  estimation.  In  the  proposed 
approach  these  two  aspects  of  the  problem  can  be  handled  simultaneously.  This 
leads  to  the  possibility  of  using  a  priori  information  about  the  source  spectrum  to 
enhance  detection  of  specific  source  types  while  rejecting  other  source  types.  In 
other  word,  the  approach  has  a  certain  built  in  discrimination  capability. 

(iii)  Capability  of  handling  multipath  propagation. 

In  many  cases  the  source  signals  find  several  different  propagation  paths  (with 
different  delays)  to  the  sensor.  These  multipath  signals  are  capable  of  "confusing" 
some  of  the  TDOA  estimators  and  degrading  their  performance.  Insensitivity  to 
multipath  propagation  is  an  inherent  feature  of  our  approach,  as  will  be  discussed 
in  Section  II. 

(iv)  Tracking  moving  sources  and  adaptivity. 

Some  implementations  of  our  proposed  approach  (Appendix  A  and  B)  are  capable 
of  tracking  (even  rapidly)  time-varying  system  parameters  and  therefore  capable  of 
tracking  moving  sources.  Furthermore,  these  algorithms  are  capable  of  performing 
optimal  estimation  (e.g.  in  the  least-squares  or  maximum  likelihood  sense)  for 
nonstationary  processes.  This  is  a  favorable  property  since  in  many  applications  the 
assumption  of  stationarity  does  not  hold  (even  if  the  source  is  not  moving).  In  fact, 
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the  adaptive  nature  of  these  algorithms  makes  possible  optimal  estimation  without 
prior  knowledge  of  the  source  or  noise  statistics. 

The  structure  of  this  paper  is  as  follows:  in  Section  II  the  basic  approach  is 
described  in  detail  for  the  case  of  a  single  autoregressive  (AR)  source.  In  Section  III 
the  approach  is  applied  to  multiple  AR  sources,  and  some  of  the  difficulties 
involved  are  discussed.  Section  IV  summarizes  possible  extensions  of  our  approach 
to  a  more  general  class  of  sources  (  autoregressive  moving-average,  ARMA),  to 
moving  sources  and  to  direct  estimation  of  source  location  (rather  than  TDOA). 

Finally,  it  should  be  noted  that  the  emphasis  in  this  paper  is  on  the  conceptual 
development  of  a  unifying  framework  for  processing  multi-sensor  multi-source 
data.  We  will  not  go  into  the  details  of  deriving  the  actual  algorithms  (some  of 
which  are  presented  in  the  appendices)  or  into  various  numerical  and  other 
practical  issues  involved  in  the  actual  application  of  such  algorithms.  These  will 


be  treated  in  later  papers. 


Fig.  1  Two  sensors  and  one  source. 


II.  A  Single  Autoregressive  Source 


To  illustrate  the  basic  ideas  of  our  approach  we  start  with  the  following  simple 
problem  as  shown  in  Figure  1.  Two  sensors  are  measuring  the  signal  propagating 
from  a  source  located  somewhere  in  the  plane.  We  assume  that  the  propagation 
involves  only  some  time  delays  and  attenuation.  Thus,  the  outputs  y^,  y2  of  the 
two  sensors  can  be  modeled  as: 

yi  (t)  -  x(t-T  j)  +  7ij(/)  (la) 

yz(t)  -  cx{t-r2)  +  n2(r)  ,  (lb) 

where  Tj,  r2  are  the  propagation  delays  from  the  source  to  the  two  sensors,  c 
represents  attenuation  and  rij,  n2  are  independent  measurement  noise  processes. 
The  time  sampled  version  of  these  outputs  will  be  written  as: 

y^k)  -  x(k-D{)  +  n{(k)  (2a) 

y?(k)  -  c  x(k-D2)  +  n2(k)  ,  (2b) 

where 

t  ■  k  Ar  ,  tj  -  Dj  Af  ,  t2  -  D2  Ar. 

Note  that  the  delays  Tj,  T 2  are  assumed  to  be  integer  multiples  of  the  sampling 
period.  No  difficulties  arise  when  r  j,  T2  are  non-integer  multiples  provided  that 
the  sampling  period  At  is  properly  chosen.  This  point  will  be  discussed  in  more 
detail  later. 


The  source  process  x(k)  is  an  autoregressive  process  of  order  n,  i.e. 

(i 

x(k)  *  -  ^  ai  *^-0  +  “(*)  • 
i«I 


(3) 


where  u(k)  is  a  white  driving  process.  Taking  the  z-transform  of  equation  (2)  we 


get 


y^z)  -  z*»l  X(z)  +  N j(z)  -  z-°l  +  A/j(z)  (4a) 

K2(z)  -  c  z‘D2  X(z)  +  N2(z)  -  c  z'D2  ^  +  A/2(z)  ,  (4b) 


where 


a(z)  -  i  +  2  ai 
i-1 


(5) 


Written  in  vector  form  the  transfer  function  from  the  driving  process  u  to  the 
sensor  outputs  y2  is 

Y(z)  -  B(z)  X(z)  +  N{z)  -  B(z)  ^  0(z>  +  N(z)  ,  (6a) 

where 


B(z)  - 

1 

'n 

-o 

■ 

Y»i  ' 

1 

."■“t . 

(6b) 


Note  that  the  numerator  of  this  transfer  function  contains  the  information  about 
the  target  location  (i.e.  the  TDOA),  while  the  denominator  contains  the  dynamics  of 
the  source  process. 


This  simple  example  suggests  the  following  method  for  estimating  TDOA:  fit  an 
ARMA  model  to  the  observed  time  series  y{k).  That  is,  find  coefficients  {  ait  Bi  }, 
where 


it  it 

y(k)  «  -  ^  ai  #'*)  +  2  Bi  u(k~0  +  t{k)  , 

i-1  i-1 

where  e{k)  is  a  correlated  noise  process  given  by 


(7a) 


e(k)  -  n(k)  +  ^  ai  •  (7b) 

i-1 

For  simplicity,  assume  a  large  signal-to-noise  ratio  (SNR)  in  which  case  e{k)  is 


negligible.  In  other  words,  iet 


(8) 


n  n 

><*)■-  S  ai  #-»)  +  2  5<  • 

»-l  i-1 

This  assumption  will  be  relaxed  in  section  III. 

The  coefficients  {  a,-,  Bi  }  will  be  chosen  to  best  fit  equation  (7)  in  the  mean 
square  error  sense,  by  performing  system  identification  based  on  the  observed 
outputs  y(k).  Then,  by  examining  the  numerator  coefficients  Bit  the  TDOA  can  be 
found.  It  should  be  noted  that  the  numerator  polynomial  found  in  this  manner 
will  not  be  unique,  since  without  having  direct  measurements  of  the  source  x{k) 
the  absolute  delays  (i.e.  the  degrees  Dj,  D2  of  the  polynomials  ftj,  b2  )  can  not  be 
determined.  However,  the  difference  in  the  degrees  of  the  polynomials  6j ,  b2  will 
be  unique  and  equal  to  the  TDOA,  A12  -  £>j  -  Z>2,  which  provides  the  desired 
information  about  the  source  bearing. 

Consider  now  the  case  where  multipath  propagation  is  present.  For  example, 
assume  that  sensor  1  received  a  delayed  version  of  the  direct  signal,  as  indicated  in 
Figure  2. 

In  this  case 

-  x(*-Dj)  +  Cj  xU-Dj-ij)  +  nj(A)  (9a) 

y2(k)  -  c2  x{k-D2)  +  n2U)  ,  (9b) 

where  Cj  represents  the  attenuation  he  indirect  propagation  path  and  5,  its 
additional  delay.  The  numerator  polynomials  in  this  case  will  be 

frj(z)  -  t’D I  +  Cj  z-{D i  *  (10a) 

b2(z)  -  c2  z'°2 


(10b) 


Thus,  the  multipath  propagation  introduces  additional  terms  in  the  numerator 
polynomials.  Note,  however,  that  these  additional  terms  will  always  have  a  higher 
degree  than  the  term  corresponding  to  direct  propagation,  because  of  the  additional 
delays  involved.  Therefore,  if  we  look  at  the  degrees  of  the  first  nonzero 
coefficient  in  b2(z)  (i.e.  the  coefficient  of  the  lowest  degree  term)  we  can  still 

find  the  TDOA,  AJ2  ■  Dj  -  D2  ! 

To  illustrate  this  procedure,  assume  that  we  performed  the  ARMA  fitting  on  a 
given  data  set  {y(k )}  and  found  the  numerator  coefficients.  As  plotted  in  Fig.  3  the 
first  significant  non-zero  coefficient  of  6j(z)  is  61)2,  and  the  first  non-zero 
coefficient  of  62(z)  is  624.  Thus,  the  TOA  difference  in  this  case  is  4  -  2  -  2.  The 
small  nonzero  coefficients  are  due  to  imperfect  modeling  because  of  measurement 
noise,  while  the  fairly  large  coefficients  following  &2  4  are  due  to  multipath 
propagation. 

Figure  3  also  provides  an  indication  of  what  will  happen  if  the  delays  are 
non-integer  multiples  of  the  sampling  period:  instead  of  having  a  single  nonzero 
coefficient  associated  with  a  given  delay,  we  will  have  two  large  coefficients 
whose  relative  magnitudes  reflect  how  close  the  real  delay  is  to  the  delay 
represented  by  that  coefficient.  (For  example,  if  k  A  t  <,  Tj  s  (fc+l)  A  t  we  may 
expect  both  k  and  ij  to  be  nonzero).  In  other  words,  as  long  as  the  sampling 
rates  are  sufficiently  high  compared  to  the  bandwidth  of  the  underlying  process, 
the  same  approach  will  work  for  non-integer  time-delays. 

This  approach  can,  of  course,  be  easily  generalized  to  the  case  when  there  are 
more  sensors.  In  the  case  of  M  sensors  we  have 
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B(z)  -  [  bx(z),  b2(z) . bM(z)  ]T  .  (11) 

Comparison  of  the  degrees  of  the  first  nonzero  terms  of  b^z)  and  bj(z)  will  give  the 
TDOA,  A jj  -  Dj  -  Dj  .  The  generalization  to  the  case  where  more  than  one  source  is 
present  is  more  complex  and  will  be  discussed  in  Section  III. 

Identification  Techniques 

The  discussion  so  far  has  illustrated  that  infromation  about  the  source  bearing, 
i.e.  TDOA,  is  contained  in  the  coefficients  of  the  ARMA  representation  of  the 
observed  measurements.  The  question  remains,  how  to  compute  these  ARMA 
coefficients  ( ait  Bi )  given  a  set  of  measurements  {  y( k)  }fcm^/(  . 

This  type  of  problem  has  been  widely  studied  in  the  context  of  estimation  and 
identification  of  linear  systems.  Finding  {  af,  fl*  }  in  equation  (8)  is  usually 
referred  to  as  "the  case  of  correlated  residuals".  A  simpler  version  of  this  problem  is 
the  following! 

Assume  that  the  measurements  y(k)  are  the  output  of  an  AR  system  driven  by 
white  noise  v(k) 

jK*)  -  -  2  ai  +  v(k)  .  (12) 

i 

Find  which  will  minimize  the  mean  square  error  E  { tjW  -  $fc)]2  }  where 

y(k)  *•  -  2  aj  The  solution  of  this  problem  is  fairly  straightforward  and 

i 

typical  algorithms  can  be  found  in  [9-10]. 
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As  can  be  seen  by  comparing  equations  (8)  and  (12),  in  the  source  location 

problem  v(k)  is  a  correlated  sequence  of  random  variables.  In  fact  it  has  the  form 

n 

v{k)  -  2  (13) 

where  u(k)  is  an  independent  white  noise  process. 

Identifying  the  system  when  the  sequence  of  v(k)  (also  called  the  residuals)  are 
correlated  is  somewhat  more  complicated.  Several  techniques  have  been  suggested 
to  deal  with  this  case,  including: 

(a)  repeated  least-squares 

(b)  generalized  least  squares 

(c)  the  maximum  likelihood  method 

(d)  instrumental  variables 

For  references  to  these  techniques  see  the  survey  by  Astrom  [9], 

More  recently  a  new  approach  has  been  developed  by  Morf  [11-13]  which 
provides  efficient  forms  of  the  so-called  exact  recursive  least  squares  algorithms, 
i.e.  RML  1  or  2  (see  e.g.  [14],  [15]).  These  new  forms  have  the  added  advantages  of 
computational  efficiency  and  fast  parameter  tracking  capability  [16-17].  The  last 
property  is  important  for  tracking  moving  sources,  since  the  ARMA  model 
corresponding  to  such  sources  has  time  varying  parameters.  We  will  discuss  this 
further  in  section  IV. 

Another  attractive  feature  of  the  exact  recursive  least-squares  algorithms  is  that 
they  do  not  require  stationarity  of  the  the  underlying  source  and  noise  process. 
This  is  in  contrast  to  the  type  of  algorithms  which  are  used  in  conventional  array 
processing  where  stationarity  is  often  a  necessary  assumption.  Furthermore,  these 


algorithms  preserve  their  computational  efficiency  under  nonstationarity,  while 
other  approaches  lead  to  solutions  of  increased  complexity  when  stationarity  is  lost 
(see  e.g.  [20-21]). 

The  details  of  these  (and  other)  algorithms  can  be  found  in  the  references  and 
the  appendices.  It  should  be  emphasized  that  the  significance  of  the  problem 
formulation  presented  earlier  (equation  (8))  is  that  it  makes  it  possible  to  apply  any 
identification  technique  to  the  source  location  problem.  The  algorithms  presented 
in  the  appendices  are  merely  examples.  A  comparative  evaluation  of  different 
identification  algorithms  in  the  context  of  the  source  location  problem  has  not  yet 
been  performed.  However,  it  is  our  belief  at  this  point  that  the  exact  recursive 
least  squares  algorithms  are  particularly  suitable  for  this  application.  Further 
discussion  of  these  algorithms  and  simulation  results  will  be  presented  in  future 
papers. 

The  identification  techniques  mentioned  above  provide  estimates  of  {  ait  Bi  }.  As 
mentioned  in  the  introduction  the  coefficients  a-  are  related  to  the  spectral 
"signature"  of  the  source.  In  fact  the  power  spectral  density  of  the  source  is  given 
by 

R{i)  "  a(z)  L(  \lz)  '  (14) 

Estimating  the  spectrum  by  autoregressive  modeling  is  considered  one  of  the 
better  spectral  estimation  techniques  (see  e.g.  [18-19],  popularly  referred  to  as 
"maximum  entropy"  spectral  analysis).  We  therefore  have  reason  to  expect  good 
performance  from  our  proposed  approach. 
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Fig.  4.  N  sources  and  M  sensors 
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III.  Multiple  AH  Sources 


In  many  applications  the  signals  received  by  the  sensors  are  generated  by  several 
sources.  Sometimes  we  are  interested  in  all  of  these  sources,  while  in  other 
situations  only  a  subset  of  them  is  of  interest;  the  others  are  merely  "interference". 
For  example,  suppose  we  want  to  detect  low  flying  aircraft,  but  the  sensors  also 
receive  noise  from  a  nearby  generator  or  truck.  The  method  described  here  is 
capable  of  simultaneously  estimating  the  bearing  (TDOA)  of  all  sources  in  the  area 
covered  by  the  sensors.  Thus,  the  existence  of  point  interference  sources  does  not 
represent  any  special  problems  in  this  approach.  The  interference  is  simply  another 
source  whose  parameters  are  to  be  estimated.  (A  good  example  of  "relabeling" 
sources  appears  in  the  estimation  theory  explanation  of  "noise  cancelling"  discussed 
in  [17]).  In  the  following  discussion  we  will,  therefore,  not  distinguish  between 
interferers  and  sources  of  interest. 

Consider  an  area  containing  N  sources  and  M  sensors  (Fig.  4).  Each  source  is 
assumed  to  be  of  an  autoregressive  type,  i.e. 

xAk)  -  -  a-j  x-(k-j)  +  Uj(k)  ,  i  -  1 . N  (15a) 

or  written  in  vector  form 

n 

x{k)  ■  -  S  .  (15b) 

y-i 

where 


n  -  max  {n,-} 


A j  -  diag  {  ai;-  }  . 

Written  in  the  z-transform  domain  we  get 

X(z)  -  A-Hz)U(z)  ,  (16a) 

where 

it 

A(z)  -1+2  A  •  z~i  -  diag  {  A  fa) }  ,  (16b) 

ik 

A  fa)  -1  +  2  aii  z'}  (16c) 

M 

In  the  previous  section  we  have  seen  that  the  propagation  from  a  single  source  to 
the  M  sensors  can  be  represented  by  a  polynomial  matrix  of  dimension  M  x  1 .  It  is 
easy  to  see  that  in  the  multi-source  case  we  can  write 

Y(z)  -  fl(z)  X(z)  +  N{z)  -  B(z)  A'Hz)  U{z)  +  N(z)  ,  (17) 

where  B(z)  is  an  M  x  N  polynomial  matrix  (and  X(z)  is  an  N  x  1  vector  of 
polynomials).  Each  column  of  this  matrix  represents  the  transfer  function  from 
one  of  the  sources  to  the  sensors. 

Using  the  same  reasoning  as  before,  we  may  conclude  that  the  i-th  column  of  B(z) 
provides  information  about  the  TDOA  for  signals  generated  by  the  i-th  source. 
Thus,  estimating  the  coefficients  of  fl(z)  will  essentially  solve  the  multi-source 
location  problem.  Estimating  /4(z)  will,  of  course,  provide  information  about  the 
spectrum  of  each  source. 

The  problem  of  locating  multiple  sources  and  finding  their  "spectral  signatures" 
can,  therefore,  be  formulated  as  an  identification  problem  of  a  multi-input  (N) 
multi-output  (M)  system,  see  also  Fig.  5.  All  the  techniques  which  have  been 
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developed  for  system  Identification  can,  at  least  in  principle,  be  applied  to  solve  this 
problem.  There  are,  however,  a  number  of  difficulties  in  the  multi-source  case, 
which  will  be  discussed  next. 


Matrix  Fraction  Descriptions  (MFD) 

In  equation  (17)  we  have  derived  a  transfer  function  of  the  form 

H(z)  -  B(zM'5(z)  .  (18) 

This  type  of  system  representation  is  called  the  Matrix  Fraction  Description  (MFD) 
of  a  multi-input  multi-output  linear  system.  The  properties  of  MFD  have  been 
extensively  treated  in  the  literature  of  linear  systems,  e.g.  [22-23],  Here  we  shall 
only  mention  some  of  the  basic  facts  which  are  relevant  to  the  source  location 
problem. 

(i)  Minimal  Realizations  and  Uniqueness 

The  MFD  representation  in  equation  (18)  is  clearly  nonunique.  If  T(z)  is  any 
polynomial  matrix,  we  can  write 

H{z)  -  B(z)  AA(z)  -  (  Biz)  T(z) )  (  A(z)  T(z)  )'1  (19) 

i.e.  the  numerator  and  denominator  matrices  can  have  arbitrary  "common  factors", 
called  common  divisors  in  the  matrix  case.  It  is  useful,  therefore,  to  restrict  our 
attention  to  the  so-called  minimal  realization  of  the  system  H(z),  which  is  a  pair  of 
matrices  4(z),  Biz)  which  have  no  more  "common  factors"  and  which  have  in  some 


sense  the  lowest  possible  "order".  It  turns  out  that  the  minimal  representation  of 


the  system  H{z)  is  still  nonunique.  The  nonuniqueness  is  due  to  the  fact  that  a 
minimal  representation  {A(z),  B(z)}  can  be  multiplied  by  any  unimodular  matrix  U 
and  remain  minimal  (i.e.  [A(z)U ,  B(z)U}  is  still  a  minima;  “epresentation). 
Unimodular  matrices  are  polynomial  matrices  whose  inverses  are  also  polynomial. 
They  are  the  analog  of  a  constant  multiplier  in  the  single-input  single-output  case 
where  a  transfer  function  b(z)la(z)  can  be  uniquely  determined  only  up  to 
multiplication  by  a  constant  c,  i.e.  b(z)/a(z)  =  c  b(z)lca(z). 

Nonuniqueness  of  this  type  may  "mix  up"  the  rows  and  columns  of  B(z),  making 
it  difficult  to  extract  the  TDOA  information.  If  we  assume  the  sources  to  be 
independent,  the  denominator  A(z)  in  the  source  location  problem  has  a  very  special 
structure,  namely  it  is  a  diagonal  matrix.  Therefore,  if  we  will  choose  that 
particular  minimal  realization  which  has  a  diagonal  denominator  matrix,  we  can 
get  a  unique  MFD  representation  of  the  source/sensor  system  (under  some  weak 
assumptions  which  will  be  met  in  all  the  applications  considered  here).  In  other 
words,  the  special  structure  of  the  source  location  problem  can  be  exploited  to 
define  a  unique  representation  of  the  system. 

(ii)  Left  and  Right  MFD's 

The  form  of  the  transfer  function  as  given  in  equation  (18)  is  only  one  of  two 
basic  forms  of  MFD's.  An  alternative  representation  is 

H(z)  =  ;Hz)fl(z)  ,  (20) 

A  A 

where  /4(z)  is  an  M  x  M  polynomial  matrix  and  B(z)  is  M  x  N .  (  A{z).  Biz)  )  and 
(  A(z),  B{z)  )  are  called  Right  MFD  and  Left  MFD,  respectively. 
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Currently  available  identification  techniques  provide  ways  for  estimating  ARMA 
coefficients  and  will  therefore  give  estimates  of  the  Left  MFD(  A{z),  B(z) ).  In  order 
to  find  the  TDOA,  we  will  first  have  to  find  the  Right  MFD  of  the  system 
(  A{z),  fl(z)  ),  given  estimates  of  the  Left  MFD.  Various  techniques  for  going  from 
Right-  to  Left  MFD's  are  available.  Recently,  some  algorithms  with  good  numerical 
properties  have  been  reported.  Note,  however,  that  in  this  case  we  want  to  go  from 
a  Left  MFD  to  a  Right  MFD  with  a  specific  structure  (namely,  diagonal  A(z)).  The 
question  of  how  to  do  this  in  a  reliable  and  efficient  way  is  not  completely  resolved 


at  this  time. 


A  more  direct  approach  would  be  the  development  of  estimation  algorithms 
(  A(z),  B(z) )  directly,  i.e.  identifying  the  Left  MFD.  No  simple  algorithms  seem  to 


be  available  and  further  work  is  needed  in  this  area. 


Most  of  the  difficulties  mentioned  hare  did  not  arise  in  the  single-source  case. 
This  was  due  to  the  fact  that  in  that  case  the  Left  and  Right  MFD’s  have  a  very 
simple  relationship! 

H(z)  -  BWWz)]’1  -  [a(z)  Iw  r1  B(z)  -  A{z)A  B(z)  .  (26) 

where  1^  is  the  M  x  M  unity  matrix.  The  ARMA  model  fitting  procedure  gives 
direct  estimates  of  the  Left  MFD  in  this  case,  since 

n  it 

y(k)  -  -  2  W  #"*)  +  2 

i-1  i-1 

-  -  2  +  S  S*  u(A-f)  .  (27) 

. i-1  i-1 

In  other  words,  B(z)  -  B(z).  A(z)  -  a(z)  and  A(z)  -  a(z). 


In  the  previous  discussions  we  assumed  a  high  SNR  situation  where  the 
measurement  noise  was  ignored.  It  is  important  to  note  that  the  measurement  noise 
can  be  embedded  in  an  ARMA  model  with  a  higher  dimensional  output  vector,  i.e. 

>i  it 

y{k)  -  -  2  ^i  #-0  +  £  u(A-i)  +  e{k)  , 
i-1  i-1 

where 

it 

e(k)  -  n(k)  +  2  ^i  > 

I**  1 


can  be  rewritten  as 


#)-  -  2  +  2  lBitAi) 

i=l  i-0 


li(k-i) 

n(k-i) 


ZQj 


where 


Bq  ■  0,  Aq  m  I  >  Cq  *  [  0,  I  ]. 

The  resulting  identification  problem  is  that  of  an  ARMA  model  with  a  white 
driving  process  w(k)  and  no  measurement  noise  term.  An  added  complication  is  the 

A 

special  structure  of  the  matrix  C-,  i.e.  the  Identification  algorithm  will  have  to 

A  A 

guarantee  that  the  last  M  columns  of  Cj  will  be  identical  to  the  columns  of  A This 
requires  some  modifications  of  the  identification  algorithms  presented  in  the 
appendices. 


It  should  be  noted  that  the  more  classical  approach  of  beamforming  is  naturally 
related  to  the  right  MFD  representation,  while  the  identification  approach  is  related 
to  the  left  MFD.  In  the  single  source  case,  beamforming  consists  of  forming  an  inner 
product  of  the  data  vector  K(z)  with  a  "steering"  vector  V(z),  to  get  a  scalar  signal  S(z) 
which  can  then  be  processed  further  (e.g.  to  get  optimal  detection), 

$(z)  -  V(z?  Y(z)  -  V{z)y  B(z)  X[z) 

The  steering  vector  is  chosen  so  that  it  "removes"  the  numerator  polynomial  which 
represents  the  propagation  model,  i.e.  we  choose  V(z)y  so  that 

Y(zf  B(z)  -  z-A 

and  then  the  received  signal  S{z)  is  just  a  delayed  (with  some  delay  A)  version  of 
the  source  signal  X{z).  To  see  this  more  clearly,  consider  the  B(z)  given  in  equation 
(6b).  In  that  case 


.  • .'i  •-*  .v. 


In  the  multiple-source  case  ^(z)  will  be  a  matrix  which  is  chosen  so  that 

Hz)1  B(z)  -  diag  [  z A  }  .  (29) 

The  numerator  polynomial  B(z)  associated  with  the  steering  vector  V(z),  is  that  of 
the  right  MFD.  Thus,  the  operation  of  beamforming  can  be  thought  of  as  finding 
the  ’’inverse”  of  B{z),  in  the  generalized  sense  of  equation  (29). 


Determining  the  Number  of  Sources 

In  order  to  perform  system  identification,  we  first  have  to  determine  the 
dimensions  of  the  system,  i.e.  the  number  of  inputs  and  outputs.  The  number  of 
outputs  is,  of  course,  the  number  of  sensors  M  and  is  Known.  However,  the 
number  of  inputs  is  equal  to  the  number  of  sources  N,  which  may  be  unknown,  or 
known  only  imprecisely.  Therefore,  it  will  be  necessary  to  repeat  tht! 
identification  process  assuming  different  numbers  of  inputs.  By  observing  the 
resulting  mean  square  errors  (i.e.  model  fitting  errors)  the  number  of  sources  N  can 
be  estimated.  Typically,  there  will  be  a  sharp  drop  in  the  error  power  (or  other 
related  measures)  when  the  system  dimension  exceeds  N. 

The  following  procedure  is,  therefore,  suggested: 

(i)  Get  an  initial  estimate  of  the  number  of  sources.  Such  an  estimate 
can  be  obtained  from  some  preprocessing  of  the  sensor  data  and 
by  keeping  track  of  the  number  of  sources  entering  and  leaving  the 


area  of  interest. 


(ii)  Choose  the  system  input  dimension  to  be  larger  than  the  estimated 
number  of  sources  (i.e.  overestimate). 

(iii)  Perform  system  identification  and  observe  the  error. 

(iv)  Decrease  the  input  dimension  by  1  and  repeat  (iii). 

(v)  Stop  when  a  sharp  increase  in  error  is  observed.  Choose  the  previous 
dimension  to  be  the  number  of  sources.  Use  the  corresponding  system 
parameters  (^^z),  B^(z))  to  estimate  source  location  and  "signature". 

If  good  initial  estimates  of  the  number  of  sources  is  available,  the  amount  of 
additional  computation  will  be  minimized.  It  is  also  possible  to  derive  algorithms 
that  are  recursive  on  the  assumed  number  of  sources,  such  that  computational 
savings  can  be  made.  Other  schemes  for  tackling  this  problem  are  possible, 
especially  in  the  case  of  moving  sources  observed  over  a  sufficiently  long  period  of 
time.  Details  will  be  presented  in  future  papers. 


IV.  Some  Extensions 


The  system  identification  approach  can  be  applied  to  a  wider  class  of  problems 
than  those  discussed  so  far.  In  particular,  it  need  not  be  restricted  to  AR  sources  and 
MA  propagation  models.  In  this  section  we  present  several  interesting  extensions 
of  these  problems  and  indicate  how  the  proposed  approach  can  be  used  to  solve 
them. 


IV.  1.  ARMA  Sources 


Many  sources  of  interest  can  be  adequately  modeled  as  the  output  of  an  AR  model 
driven  by  white  noise.  (In  fact  any  source  with  rational  spectral  density  function 
can  be  represented  this  way,  provided  that  the  order  of  the  model  is  sufficiently 
high).  However,  it  is  sometimes  necessary  to  consider  ARMA  models  rather  than  AR 
models.  To  see  the  kinds  of  difficulties  that  arise,  consider  multiple-source  vector 
x  ,  where 


X(z)  -  B(z)  At-\z)V(z) 


(30a) 


and 


As{z) 

■  diag  {  af(z) }  , 

(30b) 

Bt(z) 

-  diag  {  bj(z) )  . 

(30c) 

The  subscript  ;  indicates  elements  of  the  transfer  function  associated  with  the 
source  models.  As  we  have  seen  before,  the  propagation  model  is  given  by  a 
polynomial  matrix  representing  the  delays,  attenuations  and  possible  multipath 


propagation  for  the  sources  to  the  sensor.  We  shall  denote  this  matrix  by  Bp(z)  . 
The  sensor  output  vector  is 

YU)  -  BpU)  XU)  +  N[z)  -  BpU)  BtU)  A-Ht)  VU)  +  N(z) 

BpU)  BtU )-  BU) 

Note  that  the  numerator  matrix  fl(z)  is  a  product  of  the  propagation  model  ( Bp )  and 
the  source  MA  part.  In  order  to  recover  the  TDOA  information  it  is  necessary  to 
factor  B  into  its  components  Bp  ,  Bt .  This  is  possible  assuming  a  diagonal  structure 
for  BtU) ,  i.e. 

B,U)  -  diag  {  bt  i(z) )  . 

Because  of  this  structure,  all  the  elements  in  a  column  of  BtU)  have  a  common 
factor.  In  fact  b,  ,(z)  will  be  a  common  factor  of  the  elements  of  the  l-th  column  of 
BU)-  Therefore,  the  following  procedure  could  be  used  to  perform  TDOA  estimationi 

(i)  Estimate  A(z),  BU)  from  the  available  data 

(ii)  For  each  column  of  B{z)  find  the  common  factors.  This  could  be 
done  by  computing  the  roots  of  the  polynomials  and  looking  for 
"root  clusters".  The  columns  of  B(z)  with  the  common  factors 
taken  out  will  form  BpU). 

(iii)  Estimate  TDOA  from  BpU)  as  in  the  AR  case. 

Note,  however,  that  the  TDOA  information  is  embodied  in  differences  of 
polynomial  degrees  in  a  given  column.  Since  differences  in  degrees  are  unchanged 
by  multiplication  of  all  the  elements  with  the  same  polynomial,  it  may  be  possible 
to  extract  the  desired  information  directly  from  B(z).  This  is  easy  to  see  for  the  case 
of  no  multipath.  To  illustrate  this  consider 


y*>  - 

Z'D1  # 

,  Bt(z)  - 

bt>l{z)  0 

_  cz'^2  *  _ 

.  0  6j>2(z)  . 

where  *  are  entries  of  no  interest  to  this  discussion,  and  with 

\l<2)  "  So  +  Si  t'1  +  •  •  •  +  Sm  z'm 
Then  the  first  column  of  B(z)  -  Bp{z)  Bt{z)  will  be 


'  z-ci  ‘ 

V 

g0z-Di  +  gxz-Drl  +  . 

cz-D2bt  l(z)  _ 

- 

.  cg0z'°2  +  C^jZ-^'1  +  . 

•  ♦  v”2'” . 

Ola) 


(31b) 


(32) 


Note  that  if  we  look  at  the  first  nonzero  terms  of  these  polynomials  (  g0z'Dl  , 
cgQZ~D2  )  and  take  the  difference  of  their  degrees,  we  get  the  correct  TDOA  estimate 
(D2-D,  ). 


In  summaryi  having  an  ARMA  source  model,  rather  than  a  pure  AR  model  does 
not  significantly  complicate  the  TDOA  estimation  problem. 


IV, 2.  ARMA  Propagation  Models 

The  simplest  form  of  multipath  propagation  is  caused  by  a  signal  being  reflected 
by  some  object  which  lies  outside  the  direct  line  of  propagation,  causing  an 
extraneous  delayed  version  of  the  signal  to  arrive  at  the  sensors.  This  type  of 
multipath  propagation  can  be  represented  by  a  MA  model  of  the  type  used  in  earlier 
sections  (  Bp(z)  ).  Often,  a  more  complicated  type  of  propagation  occurs.  The  signal 
can  undergo  multiple  reflections  of  the  type  encountered,  for  instance,  when  a 
sound  wave  propagates  in  a  room  (e.g.,  the  "cocktail  party"  problem:  locating  people 
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who  are  speaking  Inside  a  room).  The  sound  is  reflected  from  one  wall,  bounces  off 
the  opposite  wall  and  again  from  the  first  wall.  Similar  effects  occur  when  seismic 
waves  propagate  in  the  earth  or  when  sound  propagates  in  the  ocean  (reflected 
from  the  water/air  and  water/ocean  floor  interfaces).  This  phenomena  is  best 
modeled  by  an  AR  type  model. 

In  general,  a  realistic  propagation  model  will  combine  both  types  of  multipath 
propagation  and  will  therefore  be  an  ARMA  model,  i.e. 

Y(z)  -  Bp(z)  Ap\z)  X(z)  +  N(z)  .  (33) 

The  structure  of  the  propagation  model  depends,  of  course,  on  the  particular 
application  being  considered  and  on  the  physical  properties  of  the  propagation 
medium.  In  its  simplest  form  Ap(z)  will  have  a  diagonal  form,  meaning  that  the 
propagation  from  a  given  source  to  all  sensors  has  the  same  kind  of  propagation 
effects,  except  for  different  delays  which  are  represented  in  Bp(z).  In  general,  more 
complicated  forms  of  propagation  are  possible  in  which  case  Apiz)  will  be 
non-diagonal. 

Estimating  the  source  location  and  spectrum  requires  estimating  the  system 
transfer  function  Biz)  /4*'(z)  and  then  "factoring  out"  Bp(z)  and  B$(z)  As~\z)  .  We 
have  seen  how  this  can  be  done  for  MA  propagation  models.  The  situation  is  more 
complicated  in  the  ARMA  case,  but  under  certain  assumptions  it  can  still  be  done. 
To  see  this  consider  first  the  simpler  case  where  the  source  model  is  purely 
autoregressive,  and  we  also  assume  that  Atiz),  Ap(z)  are  diagonal, 

H(z)  -  Bpiz)  ApHz)  A'Hz)  -  Bpiz)  [  Atiz)  Ap(z)  ]-»  . 


(34) 


The  TDOA  estimates  can  be  found  directly  from  B(z)  •=  B  (z)  ,  as  before.  The 

r 

factoring  of  A(z)  into  Afo  and  Ap(z)  is  not  possible  in  general.  However,  in  many 
cases  of  interest  it  may  still  be  performed.  Consider  the  situation  where  the  time 
dependence  of  the  source  spectrum  and  the  propagation  model  are  different.  For 
example,  consider  a  nonstationary  source,  like  speech,  which  stays  fixed  in  space 
so  that  the  propagation  model  is  time-invariant.  By  repeating  the  computation  of 
A(z)  over  several  time  intervals  and  computing  its  roots,  Ap(z)  will  be  found  from 
those  roots  which  do  not  change,  while  At(z)  will  correspond  to  the  roots  which  are 
varying  from  one  time  interval  to  another.  As  another  example,  consider  a 
statistically  stationary  source,  which  is  moving,  causing  the  propagation  model  to 
change.  Here  A.(z)  will  correspond  to  the  fixed  roots  of  /4(r)  and  A  (z)  to  the 

r 

changing  roots. 

If  either  At(z)  or  Ap(z)  are  non-diagonal,  the  nonuniqueness  of  the  MFD 
representation  will  cause  some  difficulty  in  getting  the  TDOA  information  from 
B(z).  The  question  of  getting  the  proper  "canonical"  form  of  B(z),  A(z)  for  a  general 
nondiagonal  A{z)  is  not  completely  resolved  at  this  time. 

When  both  the  source  model  and  the  propagation  model  are  of  the  ARMA  type, 
the  system  transfer  function  is  given  by 

H (z)  .  Bp(z)  Ap'Hz)  Bt(z)  AtA{z)  .  (35) 

In  this  case  the  relationship  between  A(z),  B(z)  and  A  Jz),  A,(z),  BJz),  B,(z)  is  not  as 
straightforward  as  before.  Under  certain  assumptions,  the  source  and  propagation 
models  can  still  be  factored.  We  will  not,  however,  discuss  this  case  here. 


IV, 3,  Direct  Estimation  of  Source  Location 


The  set  of  TDOA’s  contains  all  the  information  required  to  locate  the  sources,  i.e. 
source  bearing  and  range  or  source  coordinates.  Thus,  estimating  the  source 
location  can  be  performed  in  two  steps:  First  estimate  the  TDOA's  (  )  using  the 

method  described  in  the  previous  section.  Then  compute  the  source  location  from 
the  set  of  A,y's  using  the  approach  presented  in  [1,2].  The  relationship  between 
the  source  coordinates  (or  equivalently:  its  range  and  bearing)  and  the  TDOA  is 
given  by  the  problem  geometry.  For  each  pair  of  sensors  in  2-D  plane,  a  given  TDOA 
corresponds  to  a  source  located  on  a  hyperbolic  line  of  position.  Different  pairs  of 
sensors  and  their  TDOA  define  different  hyperbolic  lines  of  position.  The  source 
has  to  lie  on  the  intersection  of  these  various  lines.  These  geometric  facts  can  be 
translated  into  algorithms  for  computing  source  location,  as  outlined  in  [  1 ,2]. 

A  different  approach  would  be  to  try  and  estimate  directly  the  source 
coordinates.  Since  the  functional  relationship  between  the  source  coordinates  and 
the  TDOA's  is  known,  it  is  possible  to  re-formulate  the  problem  as  an  estimation 
problem  for  those  coordinates.  The  resulting  equations  are,  however,  quite 
complex  and  possible  solutions,  especially  the  ones  that  lend  themselves  to 
distributed  computing,  are  still  under  investigation. 


IV.4.  Moving  Sources 


When  the  sources  and  the  sensors  are  fixed  in  space  the  propagation  model  will 


usually  be  time-invariant.  (This  is  not  necessarily  true,  since  the  propagation 
media  may  be  nonstationary,  but  for  the  moment  we  ignore  this  case).  Thus,  the 
corresponding  ARMA  coefficients  will  be  constant.  If  there  is  source  motion,  the 
ARMA  coefficients  will  be  time-varying.  Provided  that  these  changes  are  not  fast 
compared  to  the  system  sampling  rate,  the  identification  algorithms  will  be  capable 
of  tracking  the  system  parameter  changes  and  thus  the  source  location. 

Another  approach  to  the  moving  source  problem  is  to  assume  a  priori  a  specific 
form  for  the  time  variation  of  the  model  and  thus  reduce  the  problem  again  to 
estimating  constant  coefficients.  For  example,  the  MA  coefficients  of  the 
propagation  model  may  turn  out  to  have  a  polynomial  form 

8(<r>  -  By  *  fly(  ♦  By  I2  .  (36) 

The  coefficients  B|)0,  fl-  j,  Bi2  of  this  time  polynomial  can  be  related  to  the  source 
location  at  some  reference  time,  its  velocity  and  acceleration.  Thus  estimating  the 
time-varying  ARMA  coefficients  will  provide  direct  estimates  of  moving  source 
parameters.  The  proper  form  of  A.(t),  B^t)  and  the  resulting  algorithms  are 
currently  under  investigation. 

In  many  applications,  the  signals  received  from  moving  sources  undergo 
significant  doppler  shifts.  These  frequency  shifts  provide  information  about  target 
velocity  and  about  its  location  with  respect  to  the  sensors.  Source  location  can  be 
estimated  by  looking  at  the  differences  in  frequencies  of  arrival  (FDOA)  of  the 
signals.  The  estimation  of  target  location  from  the  set  of  FDOA's  is  analogous  to  the 
estimation  from  TDOA's  which  was  described  earlier. 

The  estimation  of  FDOA’s  fits  nicely  into  the  system  identification  framework 
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developed  in  earlier  sections.  To  see  this  consider  a  single  AR  source  of  the  type 


X(z)  -  ^  V(z)  . 
The  signals  received  from  this  source  are  given  by 

Y(z)  -  Hi z)  Uiz)  +  Niz)  , 


with 


and 


Hiz) 


6j(z)/aj(z) 

bMiz)laMiz)  . 


(37) 


(38a) 


(38b) 


af(z)  -  aiz-z^  ,  (38c) 

where  zi  corresponds  to  the  doppler  shift  from  the  source  to  sensor  t.  Estimates  of 
the  coefficients  of  a-iz)  can  be  obtained  by  using  identification  techniques  of  the 
type  described  earlier.  The  doppler  shifts  (i.e.  FDOA’s)  can  then  be  found  from  these 
coefficients  as  indicated  by  equation  (38c). 


In  the  multisource  case  Viz)  will  be  a  vector  and  Hiz)  will  have  the  form  of  a 
matrix  with  rational  polynomial  entries 

Hiz)  -  [  b'jiz)  I  aijiz)  j  1st  SM,  \S]<N.  (39) 

The  coefficients  of  the  denominator  polynomials  a-(z)  contain  the  FDOA  estimates, 
while  the  coefficients  of  numerator  polynomials  include  the  TDOA  etimates.  The 
details  required  to  substantiate  these  statements  will  be  described  elsewhere,  but 
the  basic  ideas  can  be  seen  from  the  single  AR  source  case.  Note  that  if  the  doppler 
shifts  are  negligible,  the  system  transfer  function  Hiz)  reduces  to  the  form  which 
was  presented  earlier;  the  columns  of  Hiz)  have  a  common  denominator  polynomial, 


i.e. 


aij(z)  -  aj(z)  (40) 

and  therefore, 

H(z)  -  B{z)  A'](z)  ,  A{ z)  -  (Hag  {  a j{z)  }  .  (41) 

This  equation  is  identical  to  equation  (18)  which  was  derived  while  ignoring 
doppler  shifts. 
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V.  Conclusions 


In  the  previous  sections  we  presented  a  linear  systems  framework  for  treating 
the  source  location  and  spectral  estimation  problem.  We  believe  that  the  proposed 
approach  has  potential  advantages  over  currently  used  techniques,  particularly  in 
situations  involving  multiple  sources  and  multipath  propagation.  Application  of 
this  approach  to  real  data  involves  a  myriad  of  practical  issues  which  were  not 
addressed  in  this  paper.  These  include.* 

Optimal  detection  and  TDOA  estimation  schemes  based  on  the  coefficients 
of  B  (z).  Simple  thresholding  and  interpolation  (to  refine  the 
bearing  resolution'  are  probably  suboptimal. 

Probability  of  detection  (  )  and  probability  of  false  alarm  (  Pya  ) 

curves  parametrized  by  signal-to-noise  ratios  (  SNR  )  for  the 
proposed  algorithms, 

Bounds  for  the  bearing  resolution  achieveable  by  this  approach. 

Validation  of  the  proposed  approach  by  computer  simulation. 

These  and  other  practical  issues  will  be  addressed  in  future  papers,  including  the 
presentation  of  the  computer  simulations  in  progress. 

The  source  location  problem  also  served  us  as  a  prototype  for  motivating 
research  into  several  system-theoretic  areas,  including: 

-  Development  of  efficient,  numerically  stable  algorithms  for  going 
from  Right-MFD  to  Lef t-MFD  and  vice  -  rsa. 

Algorithms  for  direct  estimation  of  Left-MFD's. 


Uniqueness  and  identifiability  of  multi-input  multi-output  MFD's. 
Identification  of  ARMA  models  with  time  varying  parameters. 

Identification  of  ARMA  models  with  special  structure  (e.g.  the  low 
signal  to  noise  case  in  section  III). 

Identification  of  systems  with  certain  types  of  nonlinearities 
(e.g.  those  appearing  in  the  direct  estimation  of  the  source  location, 
see  section  IV.3). 

These  topics  are  currently  under  investigation  and  some  partial  results  were 
obtained.  The  details  will  be  presented  elsewhere. 

The  approach  outlined  in  this  paper  makes  it  possible  to  apply  a  whole  range  c  . 
techniques  developed  for  system  identification  to  the  source  location  problem. 
Improved  performance,  the  possibility  of  handling  multipath  and  simultaneous 
estimation  of  TDOA  and  source  spectrum  are  expected  features  of  this  approach.  In 
addition,  we  have  seen  that  the  system  identification  framework  leads  to  a  number 
of  interesting  system  theoretic  questions. 
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Appendix  As 


Example  of  Recursive  Identification  Algorithms 


Perhaps  the  most  basic  and  most  commonly  used  identification  scheme  is  the 
least  squares  (LS)  method.  This  method  is  archetypical  for  several  others,  and  in 
particular  for  those  to  be  considered  here.  For  a  thorough  treatment  of  this  subject 
see,  e.g.  [9],  [10],  [14],  [15]. 


Problem  Statement 


Given  measured  input  output  data  we  are  interested  in  fitting  an  ARMA  model  to 
this  data,  as  discussed  in  section  II.  Recall  that  the  ARMA  model  parameters  are 
given  by  the  matrices  Ak  and  Bk  ,  hence  they  are  to  be  determined  from  recorded 

input-output  data  (yit),  u((),  t  *  l . A').  The  A  parameters  represent  the 

autoregressive  part,  whereas  the  B  parameters  are  the  moving-average  part. 


The  LS-  (Least-Squares)  Method  amounts  to  choosing  the  parameters  Ak,  Bk  so 

that  the  "equation  error"  e(t)  (see  e.g.  equation  7  in  section  II)  is  minimal 

N 

min/ik,Bk  E  moh2  ■ 

/V 

"W/l  /)  2  MO  +  A^yit-D  +  .  .  ,  +  Ary(t-r)  -  Bjufr-l)  -  .  .  .  -  Bf  u{t-r)  ||  2 


We  introduce  the  following  notation,  denote  transpose  by  T  and  define  s  -  p  +  m 
and  q  -  rs  ,  then  the  input  output  data  is  given  by  the  vector  of  the  measurements 

-  [}(f-I)T,  u(/-l)T,  ....  y(t-r)\  u(f-r)T  ] 

(  a  <7  by  i  vector  ) . 

And  the  model  parameter  vector  is  given  by 

A[|,r]T  -  [-Aj,  fliT . -V-  Br  J 


•V  ■ W  V  W"1  \- 


(  a  p  by  q  -  matrix  ) . 


Then  the  output  equations  can  be  written  in  vector  form  as 

y{t)  -  1 ,  r]T  y[t-\,t-r]  +  e{t) 

and  we  wish  to  minimize  the  equation  error  by  minimizing 

N 

min  A  £  II  y(t)  -  A[  l ,  r]T  y[t- 1 ,  f-r]  ||2 

i-l 


Since  (2)  is  a  quadratic  function  in  At  in  order  to  find  the  unknown  parameters 
it  is  easy  to  give  the  minimizing  A  at  time  N,  denoted  by  A explicitly  via  the 
so-called  normal  equations 

Rn  C  I.  -V  ]T  .  [  ReN,  0 . O  ]T 

or  rn-  an  m  rn- 

If  the  inverse  of  or  R^.  exists,  we  can  obtain  an  equation  for  the  estimates  of 
the  parameter  vector  that  minimizes  the  equation  error  (2) 

Ai V  *  [  RN-  J*1  rN’ 

where  with  q  *  rip  +  m)  and  q  *  s  -  (r  +  iXp  +  m) 

N 

RN  -  2  ylt,  t-r]  y[t,  <-r]T  (q  +  sbyq  +  s  -  matrix  ) 

i-l 


The  parameter  estimates  are  updated  by  increment  that  is  proportional  to  e(t+ 1),  the 
error  in  predicting  the  next  output  }(r+  i). 


r<*l 


}<+1T  -  tfM+l-rf  At 


-  y[f+l,i+l-r]T  [  I,  -AJ  ]T 

The  gain  k°  is  obtained  similarly  via  the  prediction  error  covariance  P.  of  the 
parameters . 

*Vl  '  Pt*l  "  kt*l  ^t* l'1 

A(+1  -  Pt  y[(,t+\-r] 

prediction  error  covariance  of  the  data?t+1 


t*l 


i*l 


1  +  y[t ,  t+  l-r]T  Pt  y[r,M-i-r] 

1  +  yti,  f+l-r]T 

■  pt  -  kt*\  ktJ 


where  the  recursions  for  Pt  are  recursions  for  the  inverse  of  R  ,  the  sample 
covariance  of  the  input  output  data,  since 

pt  •  t  r1 


The  above  set  of  equations  can  be  regarded  as  a  state  estimator  or  Kalman  filter, 
where  the  state  is  a  vector  of  the  constant  parameters  in  A  that  are  to  be  identified 
or  estimated.  The  output  equation  of  the  model  is  given  by  equation  7  in  section  II, 
and  the  state  equation  is  trivial  since  the  state  is  a  constant  vector.  For  vector 
measurements  {p  >  1)  the  equations  above  differ  actually  from  the  usual  state 
estimator  equations  since  then  the  "state"  is  a  p  by  q  matrix  instead  of  a  pq  by  1 
vector,  which  has  certain  implications  on  the  dimensions  of  P  ,  (here  a  q  by  q  and 
not  a  pq  by  pq  matrix);  however  it  is  not  very  hard  to  recast  this  convenient  form 
into  the  commonly  used  state-space  format,  hence  using  all  the  available 
techniques  for  analysing  and  implementing  such  algorithms. 


— 


Appendix  B: 

Pole-Zero  or  Autoregressive  Moving-Average(ARMA)  Forms 

In  this  appendix  we  discuss  fast  algorithms  for  pole-zero  or  autoregressive 
moving-average  (ARMA)  modeling.  These  forms  have  many  advantages  over  the 
forms  presented  in  Appendix  A,  see  example  [13],  [16],  [17].  They  appear 
especially  well  suited  for  real-time  applications  using  fast  pipeline  type  processors. 


B.  1  Joint  Innovations  Representation  of  the  ARMA  Process 

Given  a  pole-zero  (ARMA)  model  of  the  type  discussed  in  section  II  and  in 
Appendix  A  its  equations  can  be  rewritten  as 


yt  +  A\yt-i  + 

•  •  • 

+  Al\ Oi-N  ~  fll 

Vi  - 

...  -  BNur 

-N  "  fl0  ut 

and  in  matrix  notation,  we 

i  have 

3N 

yt  -  &V 

u« 

-  • 

with 

V 

■ 

c  im . 

A2’ 

•  •  •  •  an 

3. 

bV 

w 

^  *  ®1’ 

b2, 

•  •  •  -  Bn 

3. 

ytJ 

-  t  v .  • 

•  yt-n  ^  ’ 

u.T 

*  [  u,T  ,  ■ 

•  ut-n  ^  ’ 

leading  to  the  following  augmented  equation 

f  V  -oy  1  f  y,  1  •  fv< 


[0  V  J  [“.J  [“■  J. 

where  50  is  the  first  m  by  m  block  unit  vector. 

This  embedded  model  can  be  interpreted  as  a  2-channel  AR  model  of  the  joint 
process  {  yf,  u{  }.  Here,  the  right  hand  side  of  the  augmented  equation  is  equal  to 


...  ,.R.1 
»  "  •  '  »  *  »  *  «  *  -  •  * 


AAA  Sm 


the  joint  innovations  or  prediction  errors  of  {  yt,  u{  },  since 


B0ut 


L€ « J  Lu*  -  L  u«  J  • 

Indeed  if  we  apply  a  simple  interleaving  permutation  of  (1,  3,  5,  , 

2A/+1,  2,  4,  6,  .  .  .  2A/+2 )  to  the  augmented  equation,  we  have  a  2-channel 
one-step  predictor  of  the  autoregressive  form 

AW  zt  "  (t  • 


where 


/  0  A ,  -Bi 

mm  11 


\°m  U  I0m  0 


AN  -Bj y 


0m. 


ut . *-,VT-  Ut-N  5  ' 


T  T 


If  the  covariance  of  the  input/output  data  is  given 

the  problem  of  finding  the  linear  least-squares  predictor 

is  reduced  to  solving  a  normal  equation  of  the  following  form 

Ry*N  An  -  [  r*n  ,  0,  0,  .  .  .Of  , 

where  the  joint  covariance  is  given  by  (  E  is  the  expection  operator) 

[«,“o  r**n] 


RyUlJ  Ryuo 


RyUN  ■  E  Zi  V 


L  RyuN 


■ 


Ry*n  *  £|j,|  bt-n  ut-n  1  • 


and  the  prediction  error  or  innovation  covariance  is  defined  as 

R'n  -  *  «,  «,' 
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mRy{0)  V0)" 

Ryin)  RyJ^n) 

Ry*  - 

*  n 

*  (0)T  /m 

yu'  '  m 

Om  Om 

m  m 

m  J 

Since  Ry*N  has  a  special  structure,  i.e.  a  block  Toeplitz  matrix  with  blocks  of 
size  2m  by  2m,  the  solution  to  this  normal  equation  can  be  obtained  via  fast 
algorithms,  see  e.g.  [13],  These  algorithms  again  are  adapted  to  efficient  computer 
Implementations. 


B.2  Least-Squares  Recursions  for  ARMA  Models 


The  variables  of  interest  are  in  this  case  the  joint  forward  and  backward 
prediction  errors  of  order  n  at  time  T  is  defined  as 


\y  T* 

n,T 

“a 

>T  -  yn T-l^T-n 

in,Tm 

m 

-  An  J  XTT-n] 

«u  r 

L  n’T  J 

A 

uT  -  uri7Mr  r.„ 

y T-n  ~  yT-n)T-n*\,...,T 


\T 


uT,„  -  “r-nir-n*i,..,rj 


Bn,T  *TT-n) 


The  covariance  matrix  for  the  joint  process  is  thus  similar  to  that  given  in 
Section  B.  1  and  taking  advantage  of  its  special  structure,  one  can  obtain  recursions 
for  the  2-channel  AR  case,  i.e. 

An*l, T  ’  An,T  ~  D\*\,TR'rn,T-\  Bn,T-\ 

Bn*[,T  ’  Bn,T- 1  "  Dn*\,T  R'*n*\,T  An,T  ' 


In  the  two-channel  case,  #J|+}(y,  the  partial  correlation  matrices  exhibit 
additional  structures,  due  to  the  embedding  of  the  white  input  process  of  the  ARMA 
model  into  an  AR  model.  Since  all  future  inputs  are  assumed  to  be  uncorrelated 


with  the  present  and  past  outputs,  half  the  elements  Dn+|  ^  matrix  are 
approximated  by  zeros.  Indeed  the  partial  correlation  matrices  are  equivalent  to  the 


ARMA  model  parameter  estimates,  see  e.g.  [19],  and  they  are  now  given  by 


ut-n*l(yn,7^t)T 


The  time-updates  for  these  partial  correlations  are  obtained  in  a  similar  manner 


as  in  the  single  channel  case 


r*nTT 

AV,,r.i  •  r  -  I  ■ '  ;-y-  > 

n-i,r 

r“  rT 

■  AVi,r  *  [  '_T  ) 


In  order  to  obtain  recursions  for  the  ARMA  model  parameters,  inversions  of  the 
joint  2m  by  2m  prediction  error  covariances,  Rtf)  j*  and  Rrnj  are  needed.  These 
inversions  are  simplified  by  first  decomposing  them  into  upper-diagonal-lower 
form  and  then  taking  inverses.  Examples  of  the  different  forms  of  the  fast  ARMA 
model  parameter  estimation  algorithms  can  be  found  in  [11,13,13,16,17,19].  The 
numerical  properties  of  these  algorithms  are  still  under  investigation;  however, 
certain  ladder  canonical  forms  that  are  also  described  in  these  references,  appear  to 
be  most  desirable  properties  for  implementations. 


Estimating  Source  Location  from  Time  Difference  of  Arrival: 

A  Linear  Equation  Approach 

J.M.  Delosme*,  M.  Morf*  and  B.  Friedlander**, 

Abstract 

A  new  framework  is  presented  for  the  problem  of  estimating  source  location 
from  a  set  of  time  difference  of  arrival  (TDOA)  measurements.  The  proposed 
approach  is  based  on  the  fact  that  three  sensors  with  their  set  of  TDOA’s  determine  a 
straight  line-of-position  for  the  source.  It  is  shown  that  the  source  location  can  be 
found  as  the  solution  to  a  set  of  linear  equations,  with  the  number  of  equations 
being  equal  to  the  number  of  sensors.  The  proposed  approach  has  several  desirable 
features  including:  the  availability  of  efficient  recursive  algorithns  for  solving 
the  resulting  linear  equations,  the  possibility  of  distributed  implementation  of 
these  algorithms,  and  the  applicability  of  optimal  estimation  techniques  to  this 
type  of  equations.  Some  extensions  of  this  problem  formulation  to  the 
three-dimensional  case,  to  geopositioning  on  the  surface  of  a  sphere  and  to  source 
velocity  estimation,  are  also  discussed. 
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I.  Introduction 


A  common  way  of  finding  the  location  of  a  point  is  by  measuring  the  range 
differences  to  several  other  points  whose  location  is  known.  In  various  navigation 
system  (e.g.  Loran,  Decca,  Omega)  the  measurement  consists  of  observing 
differences  in  the  time-of-arrival  of  signals  from  sources  at  the  known  points  to  a 
receiver  located  at  the  unknown  point.  The  time  difference  corresponds  to  a  range 
difference  if  a  constant  velocity  of  propagation  is  assumed.  In  the  passive  sonar 
case  signals  travel  from  a  source  whose  location  is  to  be  estimated  to  sensors  with 
known  positions. 

The  standard  approach  to  estimating  source  location  is  based  on  finding 
hyperbolic  lines  of  position  (  LOP  ).  The  measurement  of  a  certain  TDOA  at  two 
sensor  locations,  determines  a  hyperbola.  Each  point  on  the  hyperbola  has  the  same 
range  difference  to  the  two  sensors  --  see  Figure  1.  If  more  than  two  sensors 
provide  TDOA  measurements,  several  LOP's  are  generated,  one  for  each  pair  of 
sensors.  Since  the  LOP's  will,  in  general,  be  different  and  since  the  source  must  lie 
on  all  of  them,  their  Intersection  will  provide  an  estimate  of  the  source  location. 


Figure  1  ;  Hyperbolic  Lines  of  Position 

The  hyperbolic  LOP  approach  has  several  serious  drawbacks:  the  analytic 
computation  of  the  intersection  location  is  very  cumbersome;  these  solutions  do  not 
appear  to  be  easily  extended  to  other  situations  such  as  calculating  source  velocity 
from  TDOA-s  and  their  rates  of  change  or  computing  source  location  from  Doppler 
measurements.  Furthermore,  the  complexity  of  the  solutions  makes  error  analysis 
very  difficult. 

A.~  alternative  approach  which  circumvents  many  of  these  difficulties  was 
proposed  by  Schmidt  [  1  ].  His  approach  is  based  on  the  idea  that  three  sensors  with 
their  set  of  TDOA-s  determine  a  straight  line  of  position.  In  fact,  that  line  Is  the 
major  axis  of  a  general  conic  which  passes  through  the  three  sensors  —  see  Figure 
2. 
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Figure  2  :  Location  on  the  Conic  Axis 

When  more  than  three  sensors  are  available,  several  straight  LOP-s  are 
generated  and  their  intersection  provides  the  estimate  of  the  source  location. 
Schmidt  [1]  proposes  to  compute  the  intersection  point  from  a  set  of  (^)*  linear 
equations,  N  being  the  number  of  sensors.  This  approach,  while  alleviating  some 
of  the  difficulties  of  the  hyperbolic  LOP  approach,  still  has  several  drawbacks.  The 
main  difficulty  lies  in  the  use  of  redundant  information:  the  formulation  leads  to  a 
set  of  (^)  equations,  while  in  a  sense  there  are  only  N  Independent  pieces  of 
information  corresponding  to  the  range  from  each  sensor  to  the  source.  This 
redundancy  creates  some  inconsistencies  in  the  case  of  noisy  measurements.  The 
straight  LOP  approach  and  the  difficulties  associated  with  it  are  described  in 
section  II. 

*  where  (^)  is  defined  as  , . . ^—7  . 


In  this  paper  we  propose  a  different  problem  formulation  which  is  based  on  the 
straight  LOP  approach.  The  source  location  is  found  by  solving  a  set  of  N  linear 
equations  which  contain  the  sensor  locations  and  the  source-sensor  ranges.  This 
set  of  equations  contains  all  the  relevant  sensor  data  without  redundancy.  We  also 
show  how  the  source-sensor  ra; ,ge  can  be  estimated  in  a  consistent  optimal  way  (in 
the  least-squares  sense)  from  the  TDOA  or  TOA  measurements.  Our  problem 
formulation  and  the  resulting  solution  algorithms  are  described  in  section  III. 

The  proposed  formulation  of  the  source  location  problem  has  a  number  of 
attractive  features,  which  are  mostly  related  to  the  linearity  of  these  equations: 

-  Efficient  algorithms  are  available  to  solve  this  type  of  equations 

-  Algorithms  which  are  recursive  in  the  number  of  sensors  have  been  derived. 

Thus,  initial  estimates  can  be  obtained  from  a  small  set  of  sensors,  and 
these  estimates  can  be  updated  as  information  from  more  sensors  becomes 
available. 

-  The  linearity  of  the  equations  makes  it  possible  to  implement  the 
computations  in  a  distributed  fashion.  This  is  particularly  important 
in  the  context  of  distributed  sensor  nets  [2], 

-  The  new  formulation  is  well  suited  to  the  development  of  optimal 


estimation  algorithms. 


-  The  proposed  approach  lends  Itself  naturally  to  extensions  to  several 
classes  of  problems,  Including:  source  location  In  three  dimensions, 
geopositioning  on  the  surface  of  a  sphere  and  estimating  target 
velocity  from  range  rate  measurements. 

These  extensions  are  presented  in  section  IV. 

We  believe  that  the  approach  proposed  in  this  paper  provides  a  proper 
framework  for  handling  the  source  location  problem.  A  detailed  evaluation  of  the 
approach  and  comparison  to  currently  used  techniques  is  now  under  way  and  will 
be  reported  later.*  We  are  also  in  the  process  of  deriving  optimal  estimation 
algorithms  for  the  source  location  problem,  as  outlined  in  section  III 


II.  The  Source  Location  Problem 


The  solution  of  the  "position  fixing"  problem  using  the  hyperbolic  LOP  has  been 
treated  quite  extensively  In  the  context  of  navigation  systems  like  Loran  [3], 
Omega  [4],  and  the  Global  Positioning  Systems  (GPS)  [5].  The  complexity  of  the 
equations  resulting  from  this  approach  led  to  a  solution  based  on  various 
approximation  techniques  involving  linearization  and  Taylor  series  expansion  [6]  - 
[8].  Since  the  hyperbolic  LOP  approach  is  well  documented  [9]  -[11],  we  will  not 
describe  it  here  In  detail.  Instead,  we  will  provide  in  this  section  a  description  and 
an  analysis  of  the  alternative  straight  LOP  approach  proposed  by  Schmidt. 

The  Straight  Line-of-Position  Approach 
For  ease  of  exposition,  attention  will  be  restricted  to  two-dimensional  position 
location  in  a  plane.  Extension  of  the  ideas  to  more  complicated  situations  will  be 
presented  in  section  IV. 

Consider  a  single  source  located  at  a  point  (xq,  ^q)  in  the  plane  and  N  sensors 
located  at  points  (x^,  y ■),  i  «  1,  ....  N,  as  depicted  in  Fig.  3.  We  shall  denote  by  r- 
the  distance  from  the  t-th  sensor  to  the  source,  and  by  A ji  the  range  difference  for 
sensors  i  and  j.  Then, 


and, 

V  ■  <  *0  -  > 2  +  bo  -  Ji ) 2 


It  is  shown  in  [  1]  that  for  any  group  of  three  sensors  {5^  5  .,  5^},  a  straight  line 


of  position  is  defined  by  the  relationship, 


Figure  3:  Problem  Geometry 


[  X.  Ajk  +  Xj  Aki  4  Xk  A.J  }  x  4  [yi  Ajk  4  y}  Aki  4  yk  A-.  ]  y  - 
J  +  aiZ  ^jk  +  ajZ  ^ki  +  akZ 

where 

3 i2  ■  xiZ  +  ?;2  • 


Schmidt  [  1]  proposes  to  use  the  set  of  linear  equations  obtained  by  writing  down 
equation  (1)  for  all  (ty  possible  sensor  triads.  His  solution  then  consists  of  solving 


this  set  of  equations  to  find  the  common  Intersection  point  (x,  y)  which  in  the 
noiseless  case  will  be  exactly  the  source  location  (xg,  ^q).  Some  alternative  solution 
methods  are  also  suggested,  based  on  the  observation  that  the  source  lies  on  one  of 
the  foci  of  an  ellipse  passing  through  the  three  sensors.  Here,  however,  we  will 
concentrate  on  the  set  of  (^)  linear  equations  represented  by  (1).  Our  aim  is  to 
explain  some  of  the  difficulties  associated  with  Scnmidt's  proposed  solution. 


The  Noise  Free  Case 

We  first  note  that  when  the  range  differences  are  obtained  without  errors, 
Schmidt's  set  of  (^)  equations  is  really  equivalent  to  a  related  set  of  only  N 
equations.  To  see  this  note  that 

A«j  Ajk  *ki  '  <rj  ~  ri^rk  ~  rj){ri  ~  rk} 

‘  -  riZ &jk  ~  r/  Aki  "  rk 2  Aij  (2) 

Therefore,  equation  (1)  may  be  rewritten  as: 


Using  this  operator,  the  original  set  of  equations  satisfied  by  the  source  coordinates 
(xq,  }q)  may  be  rewritten  more  compactly  as: 

°ijk  [x  *0]  +  °ijk  •  °i j k  t  ^  ^ 1 2  ] 


for  all  triads  ijk. 


By  linearity  of  the  operator  this  equation  is  equivalent  to: 

^ijk  £  *  *0  +  y  yo  ~  ^  ~  ^2  ]  -  0  for  all  triads  (6) 

Since  the  last  equation  holds  for  every  triad,  x  xQ  +  y  yg  -  (az  -  rz)/2  belongs  to  the 

kernel  of  O  ,  i.e,  there  exist  two  constants  b  and  c  such  that 

xixQ  +  ?i?o  "  (af2  -  rj2)/2  ■  c  +  b  r ■  t-  \  ,N  .  (7) 

Conversely  the  set  of  N  linear  equations  with  four  unknowns  ^g,  b,  c 

xi  x0  +  >i  >0  ~  ri  b  ~  c  "  (ai2"rt2)/2  i  m  l,  N  (8) 

gives,  by  application  of  the  operator  O,  the  original  set  of  (^)  equations.  Therefore 

the  two  sets  of  equations  ( 1 )  and  (8)  are  equivalent. 

Redundancy  for  Noisy  Measurements 

In  reality  the  range  difference  measurements  are  corrupted  by  noise.  To  see 
how  this  modifies  our  previous  conclusion,  more  insight  into  the  nature  of  the 
noise  is  needed.  In  particular,  we  have  to  distinguish  between  two  ways  in  which 
the  TDOA  estimates  may  be  obtained: 

(1)  Differencing  Epoch  or  Travel  Times 

If  the  source  emits  narrow,  well  defined  pulses  of  energy,  the  sensor  will  be 
able  to  determine  the  exact  time  those  pulses  were  received.  Since  the  sensors  do 
not  know  the  time  at  which  the  pulse  was  transmitted,  they  are  unable  to  compute 
the  travel  time  directly.  However,  by  differencing  the  recorded  arrival  times  of 
two  sensors,  the  difference  in  travel  times  (TDOA)  is  found,  since  the  unknown 
"starting  time"  cancels  out. 
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(ii)  Cross  Correlation 


When  the  sensor  emits  a  noise-like  continuous  signal  which  has  no  easily 
recognized  features,  the  TDOA  is  usually  estimated  by  cross  correlating  the  signals 
of  the  two  sensors.  See,  e.g.  [12]  for  correlation  techniques  and  [13]  for  an  ARMA 
modeling  approach. 

For  both  methods  the  error  in  the  TDOA  estimate,  or  equivalently,  in  th  range 
difference  A,  may  be  considered  as  additive  and  (approximately)  uncorrelated  with 
A.  While  in  the  first  case  this  noise  is  the  sum  of  two  independent  components, 
one  due  to  each  sensor,  in  the  second  case  such  a  decomposition  does  not  generally 
exist. 

Redundancy  for  Case  (i) 

More  precisely  for  type  (i)  techniques  the  available  measurements  have  the  form 

-v  ^  —  — 

r('  +  where  r,-  =  ri  +  r,  and  r  being  an  unknown  quantity  identical  for  all  the 
sensors  Sj,  wheras  v ,•  is  the  additive  noise.  Therefore  the  measured  range 
differences  are  obtained  as: 

A -  (  r.  +  V-  )  -  (  r j  +  V j  )  .  (9) 

It  is  easy  to  check  that  equations  (2)  through  (6)  still  hold  if  r.  is  replaced  by  the 
measurement  r.  +  Consequently,  when  A ••  is  defined  by  (9),  the  set  of  N 
equations 

*0  +  >0  -  (  r.  +  p.  )  b  -  c  -  (a-2  -  (r.  +  v^2)/ 2  ,  f  -  I ,  N  (10) 

is  equivalent  to  the  original  set  of  equations. 


Hence,  if  the  TDOA's  are  obtained  by  differencing  of  travel  times,  the  original  set 
of  equations  contains  highly  redundant  information  and  can  be  replaced  by  a  much 
smaller  set  of  equations  (10). 


Inconsistency  or  Redundancy  for  Case  (ii) 

For  type  (ii)  techniques  the  range  differences  have  the  formt 


A  ji  -  C  r£  -  ry  3  +  Vji  ,  (11) 

where  the  additive  noise  cannot  in  general  by  decomposed  as  F*  -  Vj  .  Some 
information  about  the  noise  on  the  A's  appearing  in  equation  (*  is  readily 
available  by  summing  up  around  the  loop  ijk  :  A j>  +  A  +  A^  «  £  ,  instead  of 
Eero  in  absence  of  noise.  This  information  should  be  used  to  improve  the 
coefficients  of  ( 1 ).  One  natural  way  to  do  this,  called  TDOA  averaging  in  [  1  ],  is  to 
correct  the  A's  symmetrically! 


A  ij 

&jk  ~ 
&ki  ~ 


I 

3 

S 

3 

I 

3 


(12) 


so  that 

+  &jk  +  ^ki  ’  0  • 

and  use  those  "corrected"  range  differences  in  the  computation  of  the  coefficients 
of  the  straight  LOP  ( 1)  .  Application  of  the  same  procedure  to  the  triad  {  S,*,  Sy,  Sj} 
yields: 
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(13) 


where  2'  -  A^-  +  A -j  +  Aj(-  .  Thus  two  expressions  for  A—,  which  are  in  general 
inconsistent,  have  been  obtained. 

The  fact  that  the  TDOA  averaging  procedure  yields  different  estimates  of  the 
same  range  difference  is  linked  to  its  inherent  suboptimality;  it  is  possible  to  find  a 
better  estimator  of  the  range  differences  and,  simultaneously,  to  get  rid  of  any 
inconsistency  by  using  all  the  information,  available  about  the  noise  on  the  A’s  to 
find  the  A's  . 

Since  the  exact  values  of  the  range  differences  have  the  property  2  A  «  0  ,  it  is 
natural  to  require  that  2  A  -  0  for  any  closed  path  or  loop  (involving  three 
sensors  or  more).  This  condition  is  equivalent  to  stating  the  existence  of  a  set  of 

-V  /N 

variables  {r^,  i  =1 ,  N]  such  that  A  -  r-.  Furthermore  the  inconsistencies  will 
automatically  disappear  in  a  procedure  which  evaluates  the  A's  as  A;i  -  r.  -  r- 

where  the  (^)  measurements  of  A  are  used  to  estimate  the  r's.  The  rule  for 
obtaining  the  set  {r^,  t  «  I ,  N  }  which  minimizes  the  variance  of  the  error  between 
the  estimated  A's  and  the  true  range  differences  will  be  called  here  "generalized 
TDOA  averaging"  and  will  be  described  in  the  next  section. 

The  coefficients  of  the  original  set  of  equations  should  be  computed  in  terms  of 
the  consistent  set  of  A's  which  is  the  best  estimate  of  the  set  of  true  range 


differences.  Since  A ^  -  ri  -  ry,  the  original  (^)  equations  are  equivalent  to  the  N 
equations: 

xi  x0  +  "  ri b  ~  c  "  (fli2-ri2)/2  .  i-  1,/V  (14) 

Therefore,  the  original  set  of  equations  contains  redundant  information  when  the 
range  differences  are  obtained  by  type  (ii)  techniques. 

These  two  examples  clearly  display  the  importance  of  using  the  proper  set  of 
equations  when  only  imprecise  measurements  are  available.  In  the  next  section  we 


will  look  more  closely  at  our  proposed  framework 


III.  The  Linear  Equation  Approach 


As  a  byproduct  of  the  analysis  of  the  original  set  of  (^)  linear  equations  in  x0,  }0, 
the  source  coordinates,  a  set  of  N  linear  equations  satisfied  by  Xq,  y^  and  two  other 
variables  b  and  c  was  obtained: 

xt  x0  +  ?0  "  ri  b  ~  c  *  (“i2~ri2)l 2  .  1-  I,  A/  (15) 

where  the  r.'s  are  estimates  of  the  ranges  (obtained  modulo  an  additive  term  r 
since  the  original  data  are  range  differences.)  This  set  of  equations  can  be  derived 
more  directly  using  the  fact  that  the  source  is  located  at  the  intersection  of  the  N 
circles  (S^,  r^: 

(  -  x0  )  2  +  (  yt  -  y0  )  2  ■=  r-2  -  (  r  +  r-  )  2  i  -  1 ,  N  , 


or 


-2  xi*0  ~  2  yty0  +  (  *02  +  yQ2 )  *  (  xf  +  >,-2  ) 


+  2  r  r . 


Thus, 

xi  x0  +  *0  +  '  +  Cr  ?  -  (  *02  +  >02  W2 

-  ((  x;2  +  j.2  )  -  r.2  )/2  ,  t  -  !,/V  (16) 


which  can  be  written  in  matrix  form  as 


*1  Jl  "1  1 

(a,2  -  r,2)/2 

17  . 

■  */V  ?/V  r,v  •- 

(“,V!  -  Ov2>/z  - 

where 

-  [x0.  j0.  r.  (  r  2  -  aQ2  )/2  ] 


(17a) 


(17b) 


and  T  denotes  the  transpose  of  a  vector  or  matrix. 


This  approach  can  easily  be  extended  to  the  location  problems  in  the  3-D 
Euclidean  space  and  on  the  sphere  (see  section  IV).  Moreover  this  derivation  also 
provides  the  interpretation  of  the  parameters  b  and  c: 

b  »  -  r 

c  =  -  (  r  2  -  (  x02  +  >o2  »/2  ' 

Although  r  is  unknown,  its  expression  in  terms  of  the  ranges  r,  can  be  fixed  if  a 

A* 

constraint  on  the  sum  of  the  r-'s  is  Imposed.  For  example, 

N 

2  r.  -  0  (18) 

i=l 

implies  that 

i  N 

r  -  jv  2  ri  -  (19) 

i.e.  r  is  the  average  range  from  the  source  to  the  sensors  when  the  set  {  r-,  i  -  1 ,  N  } 
is  "normalized"  so  that  ( 18)  holds. 

The  distinction  has  been  made  earlier  between  two  types  of  techniques  for 
obtaining  range  difference  measurements.  In  the  first  case  the  range  estimates  r. 
are  measured  and  can,  therefore,  be  used  in  equation  (17).  In  the  second  case  the 
range  differences  A-  are  measured  and  r.  is  .iot  directly  available.  In  the  latter 
case  however  a  generalized  TDOA  averaging  procedure  can  be  used  to  estimate  the 

**  >v  — 

r's  from  the  differences  A^  so  that  the  set  {r.  -  r.  -  r,  i  -  I,  /V}  is  actually 
accessible  in  both  cases.  This  method  is  described  next. 
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Generalized  TDOA  Averaging 


Given  noisy  measurements  Ay-  -  [r^  -  ry]  +  v^,  many  procedures  may  be  devised 
for  estimating  the  set  {rf,  i  -  1,  N)  described  above.  The  method  proposed  here  is 
basically  a  least-squares  solution,  hence  some  preprocessing  has  to  be  done  on  the 
A's  to  descard  too  noisy  measurements  ("outliers”)  or  separate  echoes  coming  from 
different  sources.  This  preprocessing  may  be  performed  considering  the  A's  alone 
by  computing  the  sums  of  measured  A’s  around  several  loops  defined  by  the  sensors 
and  using  those  quantities  in  a  hypothesis  testing  scheme.  Alternatively,  more 
sophisticated  TDOA  estimation  techniques  can  be  used  which  provide  an  "optimal" 
set  of  Ayf.  See,  for  example,  [  1 2]  and  [  1 3]. 

Writing  the  relation  ( 1 1)  for  all  the  pairs  of  stations  yields 


Letting 


we  can  write 


A  -  Pr  +  v 


1  T  -  [  I.  I.  ...  I], 


r  .  r  -  1  r  . 


Note  that  if  r  satisfies  (20),  then  so  does  r.  However  r  will  be  uniquely 


determined  given  the  information 


A  -  Pr  +  v  , 


if  the  constraint  ( 18)  is  Introduced,  i.e. 


1T  r  .  0 


(21a) 


(21b) 


If  the  a  priori  Information  on  the  mean  of  the  noise  is  subtracted  from  the 
measured  A's,  v  can  be  considered  as  zero-mean  with  covariance  matrix  R. 
Making  the  reasonable  assumption  that  the  error  v  is  independent  of  r,  the  linear 
least-squares  estimate  of  r  can  be  found  quite  easily.  The  procedure  for  finding  the 
estimate  consists  of  "whitening"  the  noise  vector  V  and  applying  a  least-squares 
algorithm  under  the  constraint  represented  by  equation  (18).  The  details  of  a 
sample  estimation  algorithm  are  presented  in  Appendix  A. 


If  we  further  assume  that  the  errors  associated  with  different  TDOA's  are 
independent,  in  which  case  the  error  covariance  R  =  I ,  the  least-squares  solution  of 
(21)  has  a  particularly  simple  form.  The  solution  is  given  by 


pr  P  r 


PT  A  , 


that  is 


•Vv.v.v 


%  ,%  •  - 


m 


w 


VV. 


(  Nl  -  IV  )  r  . 


S 

jt>l 


L  AjN  J 


(22) 


Using  the  constraint  ( 18),  the  linear  least-squares  estimate  of  r  is,  therefore,  given 
by 


N 


i  •  \  ,  N  .  (23) 

which  corresponds  to  simply  averaging  all  of  the  A ji  measurements  associated 

A  V  V  M 

with  each  station  St.  The  estimates  of  the  range  differences  are  -  r.  with  r. 

given  by  (23);  their  expressions  reduce  to  (12)  when  N  -  3.  Thus,  equation  (23) 
indeed  represents  a  generalization  of  the  TDOA  averaging  method. 


Solution  Methods 

Assuming  now  that  the  r.'s  have  been  computed,  the  set  of  equations  (17)  may  be 
used  to  obtain  an  estimate  of  the  source  location.  The  geometric  configuration  of 
the  stations  and  the  source  may  be  such  that  the  accurate  determination  of  the 
source  locations  is  impossible  whatever  estimation  technique  is  used.  Equation 
(17)  and  its  extensions  to  more  complicated  situations  are  particularly  well  suited 
for  displaying  such  cases  (see  Appendix  B). 


When  the  source  is  locatable  from  the  TDOA's  and  the  measurement  noise  is 


negligible,  application  of  a  least-squares  algorithm  to  the  equation  (17)  gives  the 

vector  solution  V.  Furthermore,  if  a  sequence  of  measurements  are  available 

* 

(rather  than  a  single  set  of  {  } ),  it  is  possible  to  use  estimation  techniques  which 

take  into  account  all  measurements  to  provide  the  best  possible  estimate.  To  see 
this,  we  rewrite  equation  ( 17)  in  the  form 

yt  -  Ht  vt  , 

where  y  is  the  vector  on  the  right-hand  side  of  (17a)  and  H  is  the  matrix  on  the 
left.  The  subscript  I  indicates  that  these  quantities  correspond  to  measurements 
taken  at  time  t.  Assume  now  that  a  sequence  of  measurements  of  {  r- }  is  available 

for  t  -  0,  I . In  estimation  terminology  we  have  a  sequence  of  "measurement 

vectors"  yt  and  "measurement  matrices"  H{,  and  we  want  to  estimate  the  unknown 
"parameter  vector"  V(. 

Problems  of  this  type  have  been  treated  extensively  in  estimation  literature, 
both  for  the  case  where  V,  is  a  constant  vector  (fixed  sources)  and  for  the  case 
where  tf,  is  time  varying  (moving  sources).  A  number  of  efficient  recursive 
algorithms  for  estimating  Vf  can  be  found  in  [14]  and  [15].  These  algorithms  can 
be  recursive  in  time,  i.e.  when  a  new  data  set  {  }(  or  equivalently  {yr  Hf} 

becomes  available,  the  estimate  of  the  source  location  V(  can  be  updated  to  give  an 
optimal  estimate  based  on  all  past  and  present  measurements!  The  algorithms  can 
also  be  recursive  in  the  number  of  stations,  i.e.  given  a  set  of  measurements  from  m 
stations,  they  provide  an  estimate  of  the  source  location  based  on  the  available  data. 
When  a  set  of  measurements  from  the  m  +  I  station  becomes  available,  the  estimate 
can  be  updated  without  recomputing  (17).  The  details  of  these  algorithms  will  be 
presented  elsewhere. 


When  the  measurements  are  noisy, 

y,  •  H,v,  *  «■ 

where  t,  represents  a  discrepancy  between  the  two  sides  of  equation  (17).  If  I,  is 
assumed  to  be  a  sequence  of  Independent  random  vectors,  optimal  estimation 
techniques  can  still  be  applied.  The  situation  here  is  somewhat  more  complex  than 
the  standard  linear  estimation  problem  because  of  the  inter-relations  between  the 
various  components  of  the  vector  V.  An  optimal  algorithm  for  solving  equation 
(17)  under  these  constraints  is  currently  under  development  and  will  be  presented 
in  a  later  paper.  The  algorithm  is  based  on  modifications  of  some  nonlinear 
estimation  algorithms  of  the  type  derived  by  Marcus  [16]  and  [17]. 


IV.  Some  Extensions 


I  * 

^  The  formulation  of  the  source  location  problem  given  in  the  previous  section  can 

\  ’  be  naturally  extended  in  several  ways.  Here  wp  discuss  extensions  to  three 

|  dimensions,  to  geopositioning  on  the  surface  of  a  sphere  and  to  estimation  of  source 

{  velocity. 


3-D  Euclidean  Space 

By  writing  that  the  same  location  is  the  intersection  of  the  N  (hyper- )circles  (S{, 
tj)  a  set  of  N  equations  satisfied  by  the  source  coordinates  and  the  average  range  can 
be  obtained  for  space  more  complicated  than  the  plan. 


i 

1 


‘a 

i 


I 


A  direct  extension  of  the  equations  obtained  in  the  2-D  case  gives, 

(  *{  -  *o  >  2  +  <  ”  *0 )  2  +  -  *0  >  2  •  (  ?+  U  >  2  .  I  •  \,N 

or 

-  2  xj  *0  -  2  Ji  J0  -  2  z0  f  ( *02  +  y02  +  *02  )  +  ( x,2  +  yf  +  z?  )  - 
r2  +  2r^  +  r2  . 


Define  aj  by 

ai 2  "  *<2  +  Ji  +  zi2  - 

Then  the  equation  above  can  be  written  as 


*1  Jl  *1  '1  1 

(*j2  -  rj2)/2 

. 

V  - 

• 

■  Jn  xNrN  1  ■ 

.  (<*/V2  "  ^v2)/2  . 

(24) 
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where 


V T  -  t  *0,  >o-  *o*  ^  <  r2  -  a02  )/2  ]  . 

All  the  comments  regarding  the  solution  of  ( 17)  are  directly  applicable  here. 


Surface  of  the  Sphere 

A  very  similar  formulation  holds  for  sources  and  sensors  located  on  the  surface 

of  a  sphere.  In  this  case  (  xt,  yt,  zi )  are  direction  cosines  (  x?  +  yf  +  zt-2  -  I  )  and 

measures  a  distance  on  the  sphere  whose  radius  is  taken  as  the  unit  of  length, 

hence  is  an  angle.  In  this  case  it  is  easy  to  see  that  the  following  equations  hold, 

— 

*0  *i  +  ^0  a>i  +  zozi  ■  mri  *  cos  r  cos  -  sin  r  sin  r{  ,  I  -  I,  A/  (25a) 
which  can  be  rewritten  in  matrix  form  as, 

«,  ~  i  r 

cos  r 
sin  r 


xi  y\  z\ 

* 0 

’  V  A*  " 

cos  Tj  -  sin  Tj 

•  .  . 

y<> 

• 

.  .  . 

.*o. 

■ 

■  XN  yN  ZN- 

A*  <V 

.cos  rj y  -  sin  rN. 

(25b) 


Velocity  Determination 

The  above  approach  may  be  used  to  determine  the  components  of  the  velocity 
vector  of  a  moving  source  when,  in  addition  to  the  TDOA's,  derivatives  of  the  range 
differences  are  available  (e.g.  from  FDOA's,  frequency  differences  of  arrival). 
Then,  applying  to  the  &ji's  the  same  generalized  averaging  procedure  as  for  the 

a*  — 

range  differences,  the  set  of  derivatives  {r|  ■  rj  -  r,  l  -  I,  N  )  may  be  estimated.  It 


may  also  happen  that  the  r ■  are  directly  available. 

Differentiation  of  the  equations  of  the  circles  (5^,  r .)  yields,  in  the  2-D  Euclidean 
case,  assuming  the  stations  are  fixed: 

2  x0  (  x{  -  x0  )  +  2  y0  ( Ji  -  yQ  )  -  2  (  r  +  rt )  (  r  +  r{ )  ,  i-  \,N 
which  can  be  written  as 

*0  (r+r,)r, 

>0 

r  J 

■  (r  +  rN)  rN  J  .  (26) 

Hence,  having  solved  (17)  for  y0,  r,  (26)  can  be  solved  for  ,  yQ  ,  r  . 
Moreover,  if  an  Initial  estimate  of  the  source  location  is  available,  it  is  theoretically 
possible,  by  integration  of  x0  andjg,  to  obtain  the  location  and  velocity  components 
of  the  source  from  rate  of  range  difference  measurements  only.  We  are  currently 
Investigating  the  proper  problem  formulation  for  determining  source  location  from 
FDOA  measurements  only. 
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V.  Conclusions 


The  source  location  problem  has  many  aspects,  only  a  few  of  which  have  been 
discussed  in  the  previous  sections.  Here  we  shall  briefly  describe  two  other  facets 
of  this  problem  which  we  are  investigating: 

-  Distributed  algorithms  for  source  location.  In  some  applications  (e.g.  [2])  it  is 
important  to  perform  the  source  location  calculations  in  a  distributed  fashion. 
Each  sensor  has  An  associated  processor  which  can  perform  some  of  the 
computations,  which  are  then  transmitted  to  a  central  processor  where  the  partial 
results  are  combined  to  provide  the  global  optimal  estimate.  A  distributed 
implementation  of  the  computations  minimizes  the  communication  requirements 
between  sensors  and  leads  to  flexible  and  reliable  system  designs.  The  linearity  of 
equation  (17)  makes  such  distributed  implementation  much  easier  than  in  the  case 
of  having  a  set  of  nonlinear  equations.  A  more  detailed  explanation  of  the 
distributed  processing  aspects  of  our  ap’proach  can  be  found  in  [19]. 

-  Optimal  location  estimation.  The  process  of  finding  the  source  location 
consists  basically  of  two  steps:  first,  estimating  a  set  of  TDOA's  (A^)  or  travel 
times  (r^)  from  sensor  measurements,  using  the  TDOA  averaging  method  or  a  more 
sophisticated  estimation  scheme.  Second,  use  this  set  of  estimates  to  compute  the 
source  location  via  equation  (17)  or  some  other  approach.  A  natural  question  is 
whether  this  two-step  approach  is  optimal.  In  general  we  cannot  expect  the 
optimal  estimator  to  have  this  structure.  The  complexity  of  the  nonlinear 
equations  found  in  the  hyperbolic  LOP  approach  prevented  an  adequate  treatment 
of  this  important  issue.  The  much  simpler  structure  of  our  approach  makes  the 


problem  of  finding  an  optimal  estimator  for  the  source  coordinator  more  tractable. 
This  problem  is  currently  under  investigation,  using  a  combination  of  the 
framework  described  in  this  paper  and  a  new  approach  to  the  TDOA  estimation 
problem  described  in  [13]. 

We  are  still  in  the  process  of  evaluating  the  performance  of  the  proposed 
approach  and  comparing  it  to  alternative  techniques.  However,  we  believe  that  the 
framework  presented  here  gives  a  particularly  convenient  way  of  analyzing  and 
solving  the  source  location  problem,  and  all  the  indications  are  that  its  performance 
will  be  at  least  as  good  as  of  currently  used  techniques  in  the  low  noise  case.  In  the 
case  of  very  noisy  measurements  (low  slgnal-to-noise  ratios)  significant 
Improvements  are  expected  since  optimal  estimation  algorithms  can  be  applied, 
whereas  only  ad-hoc  and  sometimes  inconsistent  approaches  have  been  used  in  the 
past. 
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Appendix  A:  Computation  of  the  Averaged  Range  Differences 


Equations 

Given  a  set  of  range  differences 

A j.  -  ri  ~  rj  +  vji  i-\,  N  ,  ]-i+\,  N 
or,  using  matrix  notation, 

A  -  Pr  +  v  , 

where  v  is  zero-mean  with  covariance  R.  The  averaged  range  differences  are 

A**'  v  v  n  ~ 

;'i  “  r«  “  r;  •  where  r  is  the  least-squares  solution  of  "  J*  -  A  which 

a,  -v 

satisfies  the  constraint  1 T  r  -  0.  The  description  of  an  algorithm  for  finding  •*"  is 
the  object  of  this  Appendix.  For  a  more  detailed  explanation  of  least-squares 
techniques  for  solving  linear  equations  see  [15], 


Algorithm 

First  the  noise  v  is  whitened: 

R-]/2  A  -  R~lfz  P  r  +  w  ,  (Al) 

where  tv  -  fH/2  v  is  white.  Then  introduction  of  the  constraint  on  r  yields  the 
system 


V 

r  . 

0 

fl-1/2  p  _ 

.  R  A  . 

(A2) 


where  the  first  equation  must  be  strictly  satisfied  and  the  other  ones  only  in  the 
least-squares  sense. 


This  system  is  equivalent  to 


where 


s  • 


b  • 


-a/,/2  o  ...  o 

JR-l/2  p  H 
Hr 

0 

R- 1/2  A 


(I +  (A/(A/-l))/2)  x  A/ 


W  x  I 


(I  +  (N(A/-!))/2)x  I 


and  H  is  the  Householder  reflection  defined  by  the  vector  1 


H  -  IN.  i-  uuT, 

/v  Uj 


where 


uT  -  [  |  +  A/'1/2,  A/'J/2 . A/-1/2  ]  I  x  A7 


The  A/  -  1  last  columns  of  A  define  a  Householder  transformation  Q  ,  a  product  of 
several  Householder  reflections,  such  that 


QA 


-A7»/2 

o  ...  o' 

17 

q 

o 

where  U  is  (A/-I)  x  (A/-I)  and  upper  triangular,  and 

Qb  .10,  p,’,  p2'  , 

with  Pj  :  (A/-I)  x  I  and  p2  :  ((A/-lXA/-2))/2  x  1  .  The  least-squares  solution  of  (A3) 
is  then  readily  obtained  as? 

jj  -  0  (a  constraint), 

and  the  other  components  of  s  are  found  by  back-substitution  from 
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The  components  of  the  least-squares  solution  r  -  H  s  are: 


and 

-  f.  -  «T  (  I  +  N V2  )  -1  ,  i  -  2,  N  (A5) 

with 

N 

a  -  N~1/2  ^  si  • 
i-2 

The  corresponding  minimum-squared  error  is  given  by: 

m.s.e.  -  p2T  p2  (A6) 

Computational  Complexity 

The  quantities  Q,  B _1/2  and,  U  may  be  precomputed  so  that,  when  no 
measurement  is  rejected,  the  computations  reduce  to: 

i)  multiply  Q  ft  I2  and  A  to  obtain  Pj  ( (N(N-l)2)/2  multiplications ) 

ii)  solve  (A4)  ( (N(N-l))/2)  multiplications  and  (N-l)  divisions  ) 

ill)  get  the  r^'s  from  (A5) 

lv)  optional: 

compute  P2  ( (N(N-l)2(N-2))/4  multiplications  ) 

apply  (A6)  to  evaluate  the  m.s.e.  ( ((N-l)(N-l))/2  multiplications  ). 

In  summary,  (N2(N-l))/2  multiplications  and  (N-l)  divisions  are  needed  for  r  , 


plus  approximately  N*/4  multiplications  for  the  m.s.e.  (If  R  -  /,  then 
P2t  p2  -  AT  A  -  p,T  p,  and  the  m.s.e.  is  evaluated  in  ((N+2)(N-l))/2 
multiplications) 

When  some  measurements  are  rejected,  more  computations  are  involved  because 
the  precomputed  matrices  cannot  be  used  with  only  simple  transformations.  For 
example,  the  inverse  square-root  of  the  reduced  covariance  matrix  cannot,  in 
general,  be  directly  obtained  from  the  precomputed  R  "*/2  by  simply  crossing  out 


some  rows  and  columns. 


Appendix  B:  Conditioning  of  the  Source  Location  Problem  (Locatability) 


According  to  the  type  of  measurement  used  to  estimate  position  (e.g.  range 
measurement  or  bearing  angle  measurement),  there  exist  some  particular 
configurations  of  the  source  and  the  stations  for  which  the  estimated  source 
location  is  very  sensitive  to  measurement  errors.  This  phenomenon,  known  as 
"geometric  dilution  of  precision"  [8]  -  [10],  also  appears  when  range  differences 
are  measured;  the  equations  derived  in  sections  III  and  IV  prove  particularly 
helpful  to  determine  the  situations  for  which  it  occurs  for  source  location  in  the 
2-D  and  3-D  Euclidian  spaces  and  on  the  surface  of  the  sphere. 


The  set  of  equations  (17)  will  have  an  infinite  number  of  solutions  if  the  rank 
of  the  matrix 


*1  yi  rl  1 


XN  JN  rN  1 


is  less  than  4.  This  is  equivalent  to  saying  that  the  rank  of  the  matrix 


*1  ?1  rl  1 


L  XN  rN  1 

At  mm 

is  less  than  4,  since  -  r •  +  r  .  This  means  that  there  exists  ( a,b,c,d )  *  0  such 
that  for  every  receiver  the  triple  (x,  y,  r)  satisfies 

ax+by+cr+d  -  0.  (Bl) 


,*•  . 


V>YwVw\ 


Consider  now  several  special  casest 


1)  If  C  ■  0  i  all  the  receivers  are  on  the  same  line  a  x  +  b  y  ■  -d  . 
Assume  now  that  c  *  0  .  Letting 

D-d  +  ax0  +  6y0  , 

(Bl)  becomes 

a(x-x0)  +  b(y-y0)  +  cr+  D  -  0 


iii)  If  D  -  0  all  the  stations  are  on  the  part  defined  by  the  set  of  coordinates 
{ (x,  y)-.  sign  l  a  (  x  -  x0)  +  b  (  y  -  y0))  -  -  sign  c  } 

of  a  general  conic  whose  center  is  the  source,  (The  conic  may  degenerate  into  two 


lines  intersecting  at  the  source  location). 


3-D  Euclidean  Space 

Clearly  if  (24)  is  considered,  there  is  degeneracy  in  the  following  situations 


-  all  the  stations  are  in  the  same  plane  (a  2-D  subspace) 

-  all  the  stations  are  on  a  general  quadric  surface  with  the  source  as  one  focus 

-  all  the  stations  are  on  a  well  defined  part  of  a  general  quadratic 
(which  may  degenerate  into  two  planes)  whose  center  is  the  source. 


Letting 


Surface  of  the  Sphere 


Cj  y j  2j  -  cos  rj  -  sin  r, 


*N  ZN  -  cos  rN  -  sin  rN 


equation  (25)  may  be  rewritten  as 


-  0  , 


with 


*o2  +  V  +  V  -  1 

cos 2  r  +  jin2  r  -  1  (B3) 

Obviously  if  p{M)  m  5  the  system  does  not  have  any  solution  (  p  denotes  the  matrix 


.%V 


rank.). 


If  p(M)  -  4  there  exists  a  vector 

t  m  [  x,  y,  z,  u,  o  } T  , 


such  that 


M  t  m  o  . 


However,  if 


x2  +  y2  +  z2 


u2  + 


(B3)  will  not  have  a  solution.  Now  if  M  arises  from  a  set  of  perfect  data  for  N 
stations  then  p(M)  <  5  i  if  p{M)  -  4  then  (B3)  has  two  solutions  and  only  two,  one 
being  the  opposite  of  the  other. 


Degeneracy  arises  only  if  p(M)  <  4.  Thus  conditions  for  degeneracy  are  obtained 
by  determining  the  configurations  for  which  the  source  and  N  i  4  stations  give  an 
M  such  that  p(M)  .  3.  Let 

u  -  cos  r 

and 


v  -  sin  r  , 

then  p(M)  -  3  is  equivalent  to  the  existence  of  two  5-tuples  ( a,b,c,d,e ). 
(a',  b',  c\ d',  tr)  such  that: 

ax  +  by  +  ci  +du+  tv  •  0 

a'x  +  b'y  +  c'z  +  rf'u  +  t'v  -  0  (B4) 


If 


dct 


d  t 


d’ 


*  0 


then 


u 

- 

A. 

_ 1 

-l 

a  b  c 

.  y . 

I,-  A 

.a'b'  c'. 

ABC 
A'B '  C'J 


LzJ 


(B5) 


Hence  u  can  be  interpreted  as  the  cosine  of  the  distance  to  the  point  (A,  B,C)  and  v  as 
the  cosine  of  the  distance  to  the  point  (A \  B C ').  Thus  (  I  -  y2  )*' 2  is  also  the  cosine 
of  the  distance  to  the  great  circle  whose  polar  point  is  ( A B'.C') .  But  (  1  -  y2  )*/2  is 
u,  therefore  the  two  conditions  (B5)  are  equivalent  to  saying  that  the  distance  to 
the  point  ( A,B,C )  is  equal  to  the  distance  to  the  great  circle  with  polar  point 
(A',  B'.C')  and  that  is  the  definition  of  some  type  of  spherical  conic  .  Moreover, 
since  u  -  cos  r,  the  source  location  is  at  the  point  (A,  B,C),  one  focus  of  the  conic. 

There  are  still  some  other  cases  of  degeneracy,  not  given  here. 


Domain  of  Unlocatability 

When  the  configuration  of  the  stations  and  source  is  close  to  degeneracy  the 
solution  is  very  sensitive  to  the  noise  in  the  data  whatever  the  procedure  used  to 
find  this  solution.  The  determination  of  the  domain  of  unlocatability  of  the  source 
given  a  configuration  of  the  stations  is  therefore  of  basic  interest. 


Considering  the  problem  of  location  in  the  plane,  if  the  N  stations  are  on  a 
straight  line  the  domain  of  unlocatability  is  the  whole  plane.  If  N  -  4  and  the 
stations  are  not  aligned  then  there  is  a  pencil  of  conics  going  through  the  stations 
and  part  of  the  unlocatability  domain  is  the  locus  of  the  two  foci  of  these  conics.  If 


A/  -  5  there  will  exist  in  general  a  conic  going  through  the  stations  locations  so  that 
the  unlocatability  domain  will  consist  of  the  two  foci  of  this  conic. 

If  N  >  5  one  may  try  to  define  which  configuration(s)  of  the  stations  optimizes 
the  conditioning  of  the  problem!  however  if  N  is  very  large  a  deterministic 
optimization  is  not  necessary.  A  random  scattering  of  the  stations  is  sufficient  since 
it  can  be  shown  (  using  the  theory  of  random  matrices  )  that  for  sensors  distributed 
at  random  and  independently  in  the  space,  the  probability  of  poor  conditioning  of 
the  equations  obtained  from  N  sensors  (  defined  as  the  sum  of  the  distances  of  the 
sensors  to  the  closest  configuration  of  degeneracy  being  less  than  some  threshold  ) 
follows  asymptotically  a  Poisson  law  whose  mean  is  an  increasing  function  of  the 
density  of  the  sensors  and  of  the  threshold.  Hence  for  large  N  the  probability  of 
near  degeneracy  when  the  stations  are  randomly  distributed  is  extremely  small. 
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INTRODUCTION 

The  need  for  decentralized  computation  arises  in  many  problems  and 
applications,  such  as  power  networks  [1],  distributed  sensor  networks 
[  2  ]  and  others,  see,  e.g.  [  3  ].  Briefly  we  can  mention  that  one  of  the 
prime  motivations  is  to  try  to  keep  the  communication-load  between 
different  sites,  or  even  between  different  local  processors  down.  We 
assume  that  there  is  a  limited  communications  bandwidth,  or  that  the 
communication  costs  are  relatively  high  compared  to  computation  costs. 

In  the  case  of  a  distributed  sensor  network  (DSN)  an  enormous  amount  of 
raw  data  is  generated  by  the  sensors,  generally  distributed  over  some 
area.  The  different  sensor  sites  are  envisioned  to  be  connected  via  a 
computer  network.  Not  all  of  the  data  may  be  relevant;  furthermore, 
many  of  the  computations  with  the  data  involve  only  parameters  and  data 
related  to  one  'station'.  Therefore  such  operations  can  be  performed  by 
a  processor  near  or  at  the  sensor  site,  and  only  some  preprocessed  results 
need  to  be  communicated  to  the  central  station,  or  a  user  on  the  network. 

The  detection  and  tracking  of  low  flying  aircraft  and  cruise  missiles 
is  one  example  where  distributed  sensor  networks  are  of  interest.  The 
sensors  can  be  active  or  passive,  but  the  generated  amount  of  data  is 
large  compared  to  the  number  of  parameters  or  states  to  be  estimated. 
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When  we  have  a  completely  centralized  scheme,  i.e.  ,  all  the  data  is 


communicated  noise  free  from  the  sensors  to  one  location,  then  it  is 
well  known  that  the  minimum  error-variance  state  estimator  is  given  by 
the  Kalman-f ilter  for  linear  models. 

Other  applications  -  as  in  power  systems  -  deal  with  the  state 
estimation  of  a  large  number  of  interconnected  subsystems.  Many  attempts 
have  been  made  in  the  past  to  decentralize  the  Kalman-f ilter ,  and  both 
optimum  and  sub-optimum  algorithms  resulted  [3]. 


In  this  paper  we  shall  have  a  closer  look  at  the  possible  distribu¬ 
ted  forms  for  optimal  estimation  and  detection  problems.  The  basic 

questions  are:  What  information  should  be  exchanged?  How  should  it  be 
combined. 

Section  2  handles  the  simplest  case  of  parameter  estimation,  or 
systems  with  no  dynamics,  introducing  the  basic  tools  in  more  detail  and 
providing  the  most  insights  in  the  distributed  aspects. 

Section  3  extends  these  results  to  systems  with  dynamics,  e.g., 
tracking  problems,  for  which  an  optimal  step-by-step  update  of  the  esti¬ 
mates  is  demanded  (i.e.,  getting  real  time  estimates). 

This  requires  substantial  communication,  and  the  problem  was  treated 
by  Speyer  [4]  and  others  [5,6].  However,  the  algorithm  proposed  by  Speyer 
implies  the  knowledge  of  the  measurement  parameters  (H^,R^)  of  each 
substation  beforehand.  This  is  an  unlikely  situation  in  many  applications, 
for  instance  when  the  sensors  are  mobile  and  the  measurement  parameters 
dependent  on  the  coordinates  of  the  node.  '.v'e  present  here  an  alternative 
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algorithm  that  does  not  require  this  a  priori  knowledge  but  by  necessity 
then  results  in  a  higher  number  of  data  to  be  transmitted.  In  Section  4, 
we  shall  look  at  a  distributed  batch-processing  algorithm.  This  is  appro¬ 
priate  if  one  does  not  need  estimates  at  every  time-point.  We  give  a 
development  of  such  optimal  "block"  processing  based  on  our  scattering 
approach  [7]  in  order  to  gain  more  insights  in  the  various  possible  forms 
of  distributing  computations  and  exchanging  information,  i.e. ,  measured 
data  or  partial  estimates. 

A  brief  discussion  of  the  static  and  dynamic  detection  problems  is 
presented  in  Section  5.  Features  of  the  algorithms  are  reconfigurability 
reduced  communication  requirements  and  survivability  (i.e.,  robust,  sub- 
optimal  by  partial  breakdown).  Flexibility  from  the  user’s  standpoint 
can  be  incorporated  in  a  hierarchical  system  design  (local  regions,  area 


coordinates,  etc.). 


-vrv  V  P  "V  r%'T  "v1  »  "  T*  "V  1  - 


2.  PARAMETER  ESTIMATION 

This  discussion  is  focused  on  algorithms  that  lend  themselves 
naturally  to  distributed  processing.  We  shall  first  consider  the  static 
estimation  problem,  since  it  leads  to  the  simplest  example  of  a  distri¬ 
buted  estimation  algorithm,  and  will  therefore  give  the  most  insight  into  the 
problem  of  distributed  processing.  These  results  will  be  used  in  Section  3. 

In  this  problem,  x  is  an  n-dimensional  vector  of  unknown  para¬ 
meters,  which  we  would  like  to  estimate .  We  will  assume  that  all  the 
information  that  we  have  about  x  are  some  (noise-corrupted)  measurements, 
and  some  a  priori  information.  Suppose  that  the  uncertainty  in  this  a 
priori  is  specified  in  terms  of  the  second  moment  of  x.  The  prior 
information  about  x  is  then  given  by 


E[x]  =  x 

o 

E[5o^]  =  n  (2.2) 

where  E  denotes  expected  value.  In  the  rest  of  this  paper,  we  assume 
for  simplicity  that  xq  =  0.  The  measurements  are  situated  at  disjoint 
locations  N^,  i  =  l,...,r.  This  may  be  disjoint  in  space  (disjoint 
sensors  )  as  well  as  in  time  (as  is  the  case  in  batch  processing  of  repeated 
measurements),  or  a  combination  of  the  two.  We  do  not  have  to  discrimin¬ 
ate  between  space  and  time  here,  because  in  this  problem,  no  dynamics  are 
involved,  and  therefore,  there  is  no  time  evolution.  We  shall  refer  to 
N  .  as  'station  i ' . 

i 

Let  N.  have  n.  measurements,  modeled  as: 
i 


y(l)  =  H(l)x  +  v(l)  ;  i=l,...,r  (2.3) 

where  is  a  p^xn  matrix,  denoting  an  'observation  matrix'  that 
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depends  on  the  node  (or  station)  N^.  The  term  represents  the 

measurement  error,  and  is  modeled  as  an  'observation  noise'.  We  shall 

assume  that  the  p.  measurements  at  N.  are  unbiased,  i.e., 

1  1 

E  v(i)  =  0,  and  that  further  the  observations  are  uncorrelated,  and 

'reduced'  (i.e.,  have  unit  variance)  E  v^v^"  =  I.  These  assumptions 

entail  no  loss  of  generalization,  since  we  can  always  preprocess  the 

data  for  an  arbitrary  measurement  error  variance,  provided  that  this 

variance  is  nonsingular.  Indeed,  let  in  general  E  v^v^'  =  ,  then 

-1/2 

z.  =  R.  y.  are  the  'reduced'  observations,  because  then: 
l  v  J  \ 


:.  =  (r^1/2  H(i))x  +  r:*'"  vv^  =  HVi'  x  +  vv^  (2.4) 


-1/2  „(i)  A  -(i)  _  ^  -(i) 

h 

V 


where 


E  v(l)  =  R  1/2  E  v(l)  =  0 

v 


— ( i) — ( i) >  -1/ 2  _  (i)  (i)  -T/2 

E  v  v  =  R  E  v  v  R  =  I 

v  v 


(2.5) 

(2.6) 


It  is  clear  that  this  preprocessing  involves  only  one  station,  and  there¬ 
fore  can  be  done  locally,  without  any  communication  with  other  locations. 
Also  v(i;>  is  assumed  to  be  uncorrelated  with  the  parameter  x,  and 
all  measurements  performed  at  stations  ,  j  ^  i. 

With  respect  to  a  spatial  distribution  of  the  N_^,  it  means  that 
the  measurement  errors  at  one  station  are  uncorrelated  with  the  errors 
at  another  station,  in  most  cases,  a  quite  reasonable  assumption.  If 
we  consider  temporal  distribution  (batch  processing)  it  means  that  the 
measurement  errors  in  one  batch  are  uncorrelated  with  those  in  previous 
batches.  For  the  moment,  we  will  leave  the  idea  of  distributed  organi¬ 


zation,  and  present  the  optimal  solution  without  decentralized  consideration 


Define: 


,  H  = 


(2.7) 


so  that  the  'global  observation'  equation  becomes: 


y  =  Hx  +  v 


(2.8) 


It  is  now  well  known  (e.g.,  see  [8,9])  that  the  linear-least-mean-square 
estimate  of  x  given  y  is  given  by: 


x  =  (E  xy') (E  yy')  y 


(2.9) 


By  direct  verification,  it  is  clear  that: 


=  H  II  IT  +  I 


E  xy'  - 


=  n  H'(H  H  H'  +  I)'x  y 


(2.10) 


After  combining  the  a  priori  with  the  observations,  the  remaining 
uncertainty  in  x  is  given  by: 

P  =  E(x-x) (x-x)7 
which  after  some  algebra  yields: 


p  =  n  -  n  h'(h  nr  +  d-1  hit 


(2. 11) 


Using  the  matrix  identity 


(A+BCD)"1  =  A_1-a"1B(c'1+DA'1B)"1DA“1 


we  obtain  (2.10)  and  (2,11) 


-1  .  ~  u(i).u(i)\-l  -  u(i)>  (i) 


x  =  U  +  L  H  H 


(2.12) 


p  =  hr1,  -  h^-h^V1 


(2.13) 


\  •-  V  ^ 


»  r  ■/„  -r  ,  • \  A  *  .  -  *  ’ 

•_  .  _  »  » _  t  *  *  »  *  >  j"  •  • 


.  v  \- 


*  V  *  ,  *  V  *1  »  *  M  *,  *  * 
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We  can  now  compare  the  use  of  formulas  (2.10)  and  (2.11)  with  those  of 
(2.12)  and  (2.13). 

In  case  the  observation  matrices  are  known  a  priori  by  the  central 
processor,  then  each  station  only  needs  to  send  the  observed  data,  or  its 
equivalent.  Under  the  assumed  knowledge  of  H,  the  posterior  error 
covariance  P  can  be  precomputed  by  either  (2.11)  or  (2.12).  The  story 
is  different  for  the  estimates.  Either  each  node  transmits  its  p^  data 
to  the  central  station,  which  combines  these  according  to  the  formula  (2.10) 
or  each  station  precomputes  a  "partial  estimate"  z^  =  (or 

PH^'y^  in  case  all  nodes  have  a  "copy"  of  the  observation  model 

(2.8)). 

V  A 

The  first  method  results  in  L  p  =  p*r  data  transmissions  (p  is 

i=l 

the  average  number  of  observations  per  station)  while  the  decentralized 

computation  scheme  requires  only  nr  data  transmissions.  Savings  in 

communications  are  thus  substantial  in  a  many-observation  situation  (i.e., 

p  >  n) .  This  is  even  more  so  if  the  observation  parameters  from  each 

node  have  to  be  communicated.  As  indicated  by  the  summation  over  i,  the 

formulas  (2.12)  and  (2,13)  lead  naturally  to  a  distributed  scheme. 

Now,  next  to  computing  the  z^\  each  node  also  forms  the 

matrix  =  H^'H^  of  dimension  n  x  n.  Then  and  are 

transmitted  to  the  central  processing  station.  Note  that  because  of  the 

symmetry  of  L^,  there  are  only  independent  quantities.  Thus 

2  ^ 

node  N.  transmits  n  +  =  -^r-  +  4  n  numbers  to  the  central  pro- 

l  2  2  2 

cessor  instead  of  p.  (from  the  data)  and  np ,  from  H^.  The  central 

l  i 

processor  then  adds  the  to  .1  \  computes  this  inverse,  and 

post-multiplies  by  the  sum  of  the 


7 


In  the  case  of  a  spatially  distributed  system,  we  have  savings  in 
data  transmission  with  distributed  processing  (versus  centralized)  if 
which  can  be  simplified  to  p^  _>  y  +  1  (since  n  and 
p .  are  integers) . 

1  2 

n  +3n 

The  total  over  all  r  stations  involves:  r  — ^ -  transmissions 

r 

for  the  distributed  scheme,  versus  £  (n+D(p.  =  (n+l)p  for  the  trans¬ 
it 

mission  of  the  data  to  a  central  processing  unit.  Again  the  trade-off 
2 

is:  r  n  -~-n  £  (n+l)p  or  roughly:  —>?}■+  1.  If  we  consider  the  batch 

2  r  2 

processing,  the  above  holds  if  we  substitute  'number  of  transmissions' 
by  'number  of  memory  locations.*  Of  course,  computational  costs  and  trans 
mission  costs  may  be  weighted  differently,  so  that  the  trade-off  between 
communication  load  and  the  computational  burden  at  a  station  may  be 
expected  at  another  value  of  p/r. 

There  are  very  many  details  that  it  will  be  necessary  to  deal  with 
in  any  actual  implementation  of  an  algorithm  such  as  the  one  described. 

To  mention  a  few  of  the  possible  refinements,  we  can,  instead  of  having 
all  r  stations  transmitting  to  the  central  processor,  have  a  tree-type 


communication  network,  e.g., 


or,  even  simpler,  a  'linear ' structure 


'  ®r 


In  the  latter,  all  stations  compute  their  z^  and  L^,  but  now, 
station  waits  for  transmission  until  it  receives  and 


and  computes 


7(2>  .  Z(D  +  z  (2) 

L(2)  -  L(1)  +  L(2) 


—(2)  —(2)  (1)  (2) 
Now  it  sends  z  and  L  across  to  (3)  instead  of  z  ,z  , 

(1)  (2) 

Lv  ,  Lv‘‘  ,  thus  'compressing'  the  data.  Clearly,  there  are  again 


r  n(n+3) 


transmissions  needed,  but  depending  on  the  sequencing,  the 


distances  involved  might  be  shorter,  and  also  more  processing  goes  on  at 
the  nodes  2  to  r,  and  less  at  the  central  coordinator.  In  case  of  a 
temporal  distribution,  this  is  even  clearer.  From  the  first  batch  z^ 
and  are  computed  and  put  into  memory  (the  line  in  the  figure  denotes 

now  temporal  evolution  rather  than  geographical  routing).  When  the  second 

(2)  (2) 

batch  is  processed,  the  z  and  L  are  added  to  the  contents  of 
the  memory,  and  therefore  can  be  stored  in  the  same  memory  unit.  Appen¬ 
dix  A  describes  a  further  refinement  for  the  computation  and  possible 


reduction  in  the  transmission  of  Lv  using  square-root  methods. 
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THE  DYNAMIC  ESTIMATION  PROBLEM;  REAL  TIME  UPDATES 
3.1  Problem  Formulation 

We  consider  here  the  state  estimation  of  a  linear  dynamic  system. 

We  shall  only  be  concerned  with  the  discrete-time  formulation  of  the 
problem,  undoubtedly  the  most  practical  for  digital  computer  applications. 
We  assume  that  the  knowledge  about  the  initial  state  of  the  system  can  be 
modelled  with  a  Gaussian  distribution,  as  we  did  in  the  previous  section 
(the  a  priori  knowledge  of  x) .  The  inputs  to  the  system  are  supposed  to 
be  purely  random,  uncorrelated  with  the  initial  state  (x^) ,  and  are 
modelled  as  "white  noise." 

Let  the  state  model  be: 


=  Vt +  ut 


(3.1) 


with  x^.  the  unknown  n-dimensional  state  at  time  t,  under  the  assump¬ 
tions  : 


xo  -  N|>0’  noJ 

ut  ~  NL°>  Qt^ 

and 

E[u.  u'l  =  0  if  t^s 

L  t  sJ 

E[ut  x']  =  0  for  all  t 


(3.2) 

(3.3) 

(3.4) 

(3.5) 


The  nodal  observations  form  a  time-sequence,  where  at  each  time  instant 
t  we  have: 


„(i)  (i) 

Ht  xt  +  vt 


;  i  =  1 ,  .  . .  ,r 


(3.6) 


For  generality  we  consider  time  variable  observation  noise  characteristics 


_  (i)  (i) 

E  v  v 
t  s 


R^5 
t  ts 


(3.7) 


A  special  feature  of  the  decentralization  of  the  sensors  is  the  fact  that 
the  measurements  are  completely  independent  and  therefore  that  the  noise 
at  different  nodes  are  independent: 


_  (i)  (j)  n 

E  v  v  =0 
t  s 


V  t,s  if  i^j 


(3.8) 


The  last  assumption  that  we  shall  make  is  the  unco.'relatedness  of  plant- 
and  observation  noise: 


_  (i)  , 

E  v  u' 
t  s 


=  0  V  t,s;  Vi 


(3.9) 


Defining  the  quantities  y  ,  H  ,  and  v^_  for  each  time  instant  as  in  (2.7), 


then  we  get  the  time-indexed  equation  (2.8): 


yt  =  Htxt  +  vt 


(3.10) 


By  virtue  of  (3.7)  we  have  the  combined  noise  characteristics: 


E  v  v' 
t  s 


=  Blockdiagonal  . 


( r ) , 

•  -K  -(3  11) 

t  t ,  s 
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The  optimal  solution  of  this  estimation  problem  is  given  by  the  well- 
studied  Kalman  filter  (e.g.,  see  [8]).  This  gives  a  step-by-step 
up-dating  method  which  incorporates  a  "time-update"  and  a  "measurement 
up-date . " 


3.2  Decentralized  Estimation  with  Known  Observation  Parameters  H ,R 
In  the  context  of  stochastic  control,  Speyer  [4]  recently  gave  a  method 
for  a  distributed  optimal  step-by-step  solution.  Since  there  the  goal  is 
to  make  a  robust  optimal  decentralized  controller,  a  "best"  estimate  is 
necessary  at  each  node  N. .  In  case  a  node  is  isolated  from  the  network, 
a  "local"  Kalman  filter  gives  a  local  (sub)optimal  estimate,  and  this  is 
used  for  the  state  feedback  controller.  With  the  network  available,  addi¬ 
tional  outside  information  is  used  to  upgrade  the  estimate.  In  order  to 
accomplish  this,  Speyer's  algorithm  requires  each  node  to  co npute  another 
n-dimensional  vector  based  on  its  local  data.  The  local  estimate  and  the 


additional  vector  are  released  to  the  network  and  all  nodes  or  a  central 


node  that  have  access  to  these  2n-r  parameters  can  combine  these  to  find 
the  optimal  estimate  (thus  coinciding  with  the  estimate,  computed  from  a 
"global"  Kalman  filter). 

For  completeness  we  give  here  the  algorithm  in  a  slightly  modified 
fashion,  but  equivalent  to  the  original.  The  details  of  the  derivation  can 
be  found  in  [4j. 

First,  each  node  computes  the  local  estimate,  based  on  Z  =  {  y^^  ,  .  .  . ,  y^1^ } 

by  a  local  Kalman  filter: 


t+1  t 


~(i)  .  „(i),  (i)  (i)  >  (i)  s 

Vtlt-i  *  Kt  (>’t  -Ht  Mt-i’ 


(3.12) 
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E 


F  H*1*'  +  R(l))_1 

t  t  t-1  t  t  t  t-1  t  t 


(3.13) 


t+1  t 


F  P(f^  F'  +  Q„ 
t  t  t-1  t  t 


P0|-l  -  no 


K(x)  H^i)/R^i))kJ1)' 

t  t  t|  t**l  t  t  t 


(3.14) 


There  are  many  other  ways  to  write  the  equations  (3.12)  to  (3.14),  which 

is  optimal  is  not  the  issue  in  this  paper,  since  only  local  processing  gets 

involved.  The  local  estimates  of  (3.12)  are  now  combined  to  form  the 

r  >  v 

estimate  given  all  the  data  Z  =  U  Z^1  as: 

t  i=l  t 

'  ^  f  (i)  .-1  „(i)/(,.,(i),.-lu(i),,  -l-(i)  ( 

t+1 1  t  Ft  ,^1Pt|  t  QPtjt-l)  Ht  (Rt  )  Ht  )Ft  xt+l|t+ht 


where  P^i t  is  the  global  filtered  estimate  error  covariance 


(3.15) 


pt|t-i  +  p  <h[i))'(r^))"1hJi) 


(3.16) 


Pt+l| t  FtPt| tFt  +  Qt 


(3.17) 


and  h^  is  the  additional  data-dependent  n-vector  which  is  locally 


updated  from: 


s  (i)  „(i)-(i) 

**t+l  t  :t+lxt+ll 


t  ’  o 


0 


(3.18) 


where : 


\  -  ptitp 


r  r 

t  t  t  t-1  t 


(3.19) 


“  Pt|t  Pt|t-lFtPt-l|t-l(Pt-l|t-l)  Ft-l'(Pt|t-l) 


,(3.20) 


In  the  original  paper  F  was  taken  to  be  constant.  The  method  fails 

when  F  is  singular  at  some  time.  As  indicated  earlier,  only  data 

(i) 

acquired  at  node  N.  is  needed  for  the  calculation  of  x^  , ■  and 

1  t+1 | t 

h^|,  but  the  computation  of  h^^  assumes  the  knowledge  of  t  and 

Pi  a  priori,  through  the  formulas  (3.19)  and  (3.20).  From  equation 

C  j  L  **  J. 

(3.16)  this  boils  down  to  the  a  priori  knowledge  of  the  observation  process 
of  all  nodes  ! 

Another  disadvantage  of  this  method  is  the  fact  that  by  breakdown  the 
computed  best  global  estimate  is  no  longer  (sub)optimal ,  although  locally 
suboptimal  estimates  are  computed.  Indeed,  if  as  suggested  by  Speyer  the 
P^l ^  and/or  Pt|t  ^  are  precomputed  and  then  fixed,  the  global  estimate 
(3.15)  will  only  be  optimal  if  all  terms  in  the  summation  are  accounted 
for  (i.e.,  if  all  nodes  communicated  their  x£+l|t  ancl  A  modifica¬ 

tion  which  would  still  yield  suboptimal  global  estimates  will  have  to 
compute  the  in  real  time  and  based  on  the  effectively  linked  stations 

in  the  network  at  the  given  time.  This  will  again  require  transmission  of 
this  real  time  computed  global  error-covariance  to  all  nodes,  so  that  robust 
estimates  can  be  formed.  We  can  conclude  that  the  above-discussed  algorithm 
is  efficient  in  the  context  of  distributed  control,  where  it  is  the  goal  to 
obtain  fault-tolerant  local  controllers,  which  were  suboptimal  when  there 


iras  a  breakdown  in  one  or  more  of  the  links  of  the  system.  The  algorithm 
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is  communication  cost-effective  if  the  observation  parameters  are  known 
a  priori.  The  global  estimate  is  not  robust. 

3.3  Decentralized  Estimation  with  Unknown  Observation  Parameters 

If  we  pursue  a  different  goal:  an  optimal  global  estimate  which 
remains  suboptimal  when  data  from  one  or  several  nodes  is  interrupted,  and 
which  requires  the  least  amount  of  data  transmission,  the  following  algor¬ 
ithm  can  be  used.  The  development  is  along  the  same  lines  as  in  Section  2, 
although  the  results  are  believed  to  be  new. 

To  account  for  optimality  in  the  presence  of  failures,  we  will  assume 
that  the  number  of  participating  nodes  is  time  variant.  Thus  in  (3.6)  we 
set  r  =  r(t) .  Moreover,  we  will  assume  that  only  at  time  t,  the  value  of 
r(t)  is  known.  In  this  situation,  the  global  error-covariance  P  cannot 
be  precomputed  as  is  needed  in  Speyer's  method.  We  will  give  the  solution 
for  the  time  update  and  the  measurement  update  separately,  basically  because 
for  the  latter,  we  can  use  the  results  obtained  in  Section  2. 


Measurement  Update 

At  stage  t,  we  suppose  that  r(t)  stations  perform  the  measurements 
(3.6).  Previous  to  this  the  best  estimate  (globally)  of  the  state  vector 
x(t)  is  x(t|t-l)  (i.e.,  the  optimal  estimate  given  data  up  to  t-1)  with 
error-covariance  ? t [ t  —  1 ’ 

First,  we  reduce  this  problem  to  the  parameter  estimation  problem  of 
Section  2.  Introducing  "normalized"  observations: 


._(i)_-l/2  (i)  (i)^  ,  |  .  ... 

_Rt  j  (y  -Ht  x ( t | t-1) ) 


(3.21) 
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we  obtain: 


( x ( t ) —x ( t | t-1) )+[R 


(i)  -1/2  (i) 
t  ]  Vt 


(3.22) 


and  it  is  easy  to  check  that  the  a  priori  x(t| t-1)  =  x(t)-x(t| t-1)  has 
zero  mean  and  covariance  |  t  1  the  second  term  in  (3.22)  satisfies 

the  conditions  (2.5)  and  (2.6).  Defining  and  v^^  as  in  (2.4), 

the  equivalent  measurement  is: 


(i) 


x(t | t-1) 


+ 


(3.23) 


From  Section  2,  the  measurement  update  —  incorporating  the  measurements 
at  time  t  —  yields  by  definition  the  estimates  x(t|t),  and  is  given 
by  equation  (2.12) 


x(t|  t)  =  (p"f  +  rz;t)H[i)^i))”1  E>H^i)S(i) 

1  1  i=l  r  c  i=.i  x  t 


(3.24) 


or,  after  back-substitution  to  the  original  parameters: 


x( t|  t)  = 


1  i=l  1=1 


(3.25) 


a  ( i  ^ 

If,  based  on  the  (global)  estimate  x(t|t-l)  and  the  data  y  ,  we  want 
a  "local”  estimate  at  node  (i),  then  we  have: 
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X^  (t|  t) 


+pt| t-ix(tl t-1))  (3-25) 


The  error-covariances  of  the  global  and  the  local  estimate  are  respectively 


t  t 


-I  -1 


-i  ritV1>'(R'1>rV1) 

Mm*?,  1  *  1 

1  =  1 


(3.27) 


t  t 


-i 


(3.28) 


Remark  1:  We  assumed  that  the  global  estimate  pt|t  ^  is  always  available 
since  this  was  our  prime  goal.  Local  estimates  are  not  necessarily 
computed. 

Remark  2:  From  (3.27)  and  (3.28)  the  following  relation  between  local  and 
global  estimates  can  be  proven: 

i  r(t)  i 

p^it  =  z  +  (i-r(t))p;it-i  •  (3-29) 

i=l 

It  is  now  clear  what  preprocessing  can  be  done  locally  in  order  to 
compress  the  data.  At  each  node,  the  data  vector  y^  and  parameters 
and  R*^  are  combined  to  retain  the  "compressed"  data: 


(i)  A  „<i>/,„<i>x-l  U) 


on) 


(3.31) 


T  (i)  * 

Lt 


Several  different  situations  may  occur,  whether  or  not  the  communication 
between  the  central  station  and  the  link  N  is  two-way,  one-way  or 
broken  down. 


CASE  (1):  Station  N.  is  linked  to  the  network  and  global  informa- 
-  x 

tion  is  accessible. 

According  to  (3.28),  can  be  computed  at  N. ,  and  a  local 

estimate  (3.26)  may  be  computed  "for  later  reference"  (i.e.,  if  at  the 


next  step  the  station  should  be  disconnected  from  the  net) .  To  the  central 
station,  or  one  of  the  nodes  performing  as  such,  z£*^  and  or 

is  sent.  The  latter  matrices  are  symmetric  of  size  n  x  n.  There¬ 


fore  they  contain  only 


n(n+l) 


independent  parameters.  In  total 


1 ^  +  np^  parameters  have  to  be  transmitted. 


CASE  (2):  Station  N.  is  linked  to  the  network,  but  has  no  informa- 
-  x  ’ 

tion  back  from  it. 

In  this  case  the  local  error  covariance  cannot  be  computed.  Thus  z£ 
and  have  to  be  sent.  A  suboptimal  local  estimate  can  still  be 

calculated.  In  many  cases  some  a  priori  information  about  x^_  will  be 
available  (e.g.,  from  a  previous  time  update).  If  absolutely  nothing  is 
known,  it  means  that  the  a  priori  estimate  can  be  arbitrary,  but  with 
infinite  variance.  This  is  equivalent  to  setting  |  equal  to  zero 

in  the  formulas  (3.26)  and  (3.28). 

The  local  estimate  is; 
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(3.32) 


A 

X 


(i) 


(t  1 1 ) 


<pt !t)lHti)/(Rti))"lyti) 


where  we  set : 


Hj1)'(R(1,)'1h‘1)  =  l'4) 


(3.33) 


Thus  is  to  be  interpreted  as  the  local  error-covariance  in  the 

absence  of  a  priori  information. 


CASE  (3):  Station  N  does  not  contribute  to  the  network. 

Local  estimates  can  be  updated  (suboptimal)  with  or  without  the 

A  I 

access  to  x(t|t-l)  and  |t  At  the  central  node,  global  updating 
only  accounts  for  the  cooperating  nodes,  i.e.,  the  cases  (1)  and  (2) 
above.  If  nodes  i=l,...,r(t)  are  these  cooperating  nodes  at  t, 

then  equations  (3.25)  and  (3.27)  (or  (3.29))  are  used  to  compute  a 
global  filtered  estimate. 
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Time  Update 

The  next  step  is  to  compute  the  new  predicted  estimates  x(t+l|t) 
from  the  filtered  estimates  x(t|t).  Given  all  data  up  to  time  t, 
we  write  from  (3.1  ) 


‘t+1  |t 


Vt  It  +  ut  It 


(3.34) 


Since  is  uncorrelated  with  the  previous  noise  ug  and  vg  for 

—  A 

s  <  t,  and  with  x^,  we  have  that  ut  |t  =  The  time  update  is 
therefore  simply: 


t+1  t 


FtXt  |t 


and 


t+1  t 


F  P  i  F' 
t  t  t  t 


+  Q. 


(3.35) 


(3.36) 


for  the  global  estimates,  while  the  local  time-updates  are: 


c(1) 
t+1  |t 


(i) 
t+1  |t 


^  -'(i) 
Ftxt  It 


FtPt(  )lK  +  <at 


(3.37) 


(3.38) 


If  the  emphasis  is  on  an  optimal  global  estimate,  the  centralized  node 
updates  from  (3.34)  and  (3.35).  A  reason  for  local  computation  of 
estimates  is  for  robustness  and  optimality.  If  for  instance  is 


cut-off  from  the  network  but  local  estimates  are  updated,  then  the  global 
estimate  can  be  upgraded  with  this  "combined"  information  at  N,  as  soon 


as  the  link  is  restored,  and  thus  not  all  the  data  is  lost. 

The  above  discussion  indicates  that  there  are  many  ways  of  organiz¬ 
ing  the  data  traffic,  and  it  seems  fruitless,  if  not  impossible  to  give 
all  details. 
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4.  THE  DYNAMIC  ESTIMATION  PROBLEM:  BLOCK  PROCESSING 

4.1  Problem  Statement 

For  the  problem,  described  in  Section  3.1,  we  give  here  an  optimal 
solution  in  case  one  does  not  need  the  estimates  at  each  time  instant, 
or  in  case  the  communication  is  impossible  during  intervals  of  fixed 
length.  In  the  previous  section  we  mentioned  the  case  of  an  "isolated" 
node.  During  its  cut-off  time,  the  new  incoming  data  was  gathered  and 
processed  to  update  the  old  local  data.  These  resulting  data  were  trans 
mitted  at  the  time  the  link  was  restored,  and  incorporated  in  an  optimal 
"forward"  way.  At  that  point  the  globa'  estimate  will  not  equal  the 
optimal  estimate  provided  by  the  Kalman  filter  that  had  access  to  all 
data  y^  at  time  tj  This  loss  of  accuracy  is  balanced  by  a  gain  in 
communication  and  time.  Indeed  to  provide  the  best  estimate  given  all 
the  data,  including  the  batch  of  missed  data,  the  global  estimator 
would  have  to  recompute  previous  error  covariances  and  estimates,  thus 
momentarily  slip  back  to  the  instant  the  node  was  cut-off  and  reprocess 
the  accumulated  data  step-by-step. 

We  present  here  an  alternative  method  for  the  case  where  the  system 
parameters  are  a  priori  known,  and  each  node  has  a  copy  of  this  availabl 
The  data  is  gathered  in  batches  at  each  node,  and  synchronized  with  all 
stations  at  the  end  of  the  interval,  the  accumulated  processed  data  is 
sent  to  the  global  estimator,  which  then  proceeds  to  compute  the  optimal 
global  estimate.  For  the  development  of  the  algorithm,  our  scattering 
approach  to  the  filtering  and  smoothing  problem  is  used  [7]  since  it 
provides  a  direct  insight  into  the  method. 


4.2  A  Distributed  Block-Processing  Algorithm 


„  r  (i)  ^  (i)  A  _  (i) 

We  have  seen  that  the  quantities  H  y  =  ^  H  y  =  ^  z 

ss  ,  «  s  s  .  ,  s 
i=l  1=1 


and 


H'H  =  l  =  l  L(l)  lead 

S  S  L,  <z  Q  *-*  Q 


to  a  distributed  computation 


i=l 

in  a  natural  way. 


i=l 


In  a  step  by  step  updating  algorithm,  then,  we  compute  locally  the 

and  z^\  transmit  these  data  to  the  central  station,  where  now 
s  s 

the  updating  occurs  (i.e.,  one  'layer'  is  added  to  the  scattering  picture.) 

Further  distributing  is  possible  if  we  can  divide  the  whole  estima¬ 
tion  interval  (t,t)  up  in  some  subintervals  of  each,  say,  o  steps  long. 
The  a  elementary  scattering  layers  can  thus  be  combined  to  a  scattering 
layer  representing  such  a  subinterval.  The  'parameters'  are: 

$(s+a,s)  P(s+cr,s)| 


S(s+a,s) 


-C(s+a,s) 


'(s+a , s) 


(4.1) 


These  parameter-matrices  can  be  computed  if  (Fg,Gg,Hs)  ^nown* 

Further,  we  can  also  combine  all  sources  into  a  qQ(s-H?,s)  and 
q^  (s-Hj ,  s) . 
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>vVv\v'“ 


Suppose  now  that  our  model  is  stationary.  Then  for  each  'time  slot' 
the  $,P,0  and  ip  =  §'  are  the  same,  and  so  are  the  gains  G*  and 
i  =  where  we  define 


q+(s+a,s)  =  I  Gi  *  z(s+i) 

i=l 


(4.2) 


qQ(s+a,s)  =  I  ‘  z(s+i) 

i=l 

Now,  we  let  each  station  compute  locally:  q^^  and  q  in  a  certain 

subinterval  (say  k)  (which  is  done  sequentially) .  At  the  end  of  the  sub¬ 
interval  (of  length  a)  the  "data"  q  ^  and  q*^  from  all  stations 
Ni  are  transmitted  to  a  'central  location',  where  the  data  is  combined 
and  the  whole  scattering  layer  is  formed,  i.e., 


I  q"(i) 

i=l  ° 


V  +U> 


+  _  f  + 
3o  '  .2  qo 


(4.3) 


While  the  stations  proceed  now  gathering  more  data  and  computing  the 
values  for  q^  and  qQ  in  the  subsequent  subinterval,  the  central  station 
combines  the  (combined)  picture  of  the  previous  k-1  subintervals  with 
the  kC^  one,  thus  obtaining  the  4>,’^,P,0,qo  and  q*  over  the  interval 
(T,T+kc)  (Fig.  4.1). 


So(T+(k-l)q,T)  '  Sq  (x+ka,x+(k-l)q)r  =  SQ(T+ka,T) 


Finally,  when  the  last  subinterval  is  processed,  the  'boundary  layer' 
representing  the  initial  condition  is  attached,  and  the  estimates  are 
computed.  We  can  represent  the  above  procedure  on  a  space/time  flow¬ 
chart,  the  vertical  direction  being  time,  the  horizontal  direction  the 
space.  For  clarity  we  only  display  one  substation.  (Fig.  4.2). 


-\y.  - 


k- . . _ _ _ 


CENTRAL  STATION 


The  Algorithm: 


1 )  Precomputation  of  the  "Data-Gains"  and  "Block  Parameters" 

If  the  parameters  are  not  constant,  they  have  to  be  recombined  in 
each  interval  by  iteration  of  the  scattering  formulas  in  order  to  obtain 
the  "block-scattering  parameters"  subject  to  the  initial  conditions 
[7],  In  Fig.  4.2  this  is  indicated  by  the  box  containing  (o,©,?,1?). 

The  gains  G*;  Gg  for  s  =  l,...,a  are  also  obtained  by  iteration  [7]. 
All  nodes  are  supposed  to  have  these  parameters  available. 


2 )  Local  Data  Processing 

At  the  beginning  of  each  block  we  reset  and  to 

zero.  As  data  comes  in,  we  proceed  step-by-step 


(i) 


H'(i)y(i> 
s  s 


+  (i) 
^s 


+  G±z(1> 
s  s 


until  s  =  a  when  the  contents  of 
sent  to  the  central  station. 


+  (i) 


and 


registers  are 


3)  Central  Computation  (k  Block) 

The  new  block  is  added  to  the  previously  combined  k-1  blocks.  As 
shown  in  Fig.  4.1,  the  composite  is  given  by  the  Redheffer  "star-product" 
[10]: 
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a  b 

“A  b~ 

— A ( I-bC ) _1a 

B4Ab(I-Cb)“1D 

* 

= 

-1 

-1 

c  d 

_C  D_ 

c+dC  ( I-bC )  a 

d(I-Cb)  D 

The  internal  "sources"  are  similarly  combined: 


Q+ 

4 

+ 

q 

+ 

r 

n  t> 

—  — 

+ 

r 

A(I-bC)-1 (q++br~) 

+ 

+ 

—  1  —  4. 

_q"_ 

_q"_ 

r 

_q  _ 

_d (I-Cb)  (r  -KSq  )_ 

y\  |  t  tl 

Finally,  the  filtered  estimates  x(t|t)  at  the  end  of  the  k  block 
are  found  by  "appending"  the  end-conditions  X(t|t)  =  0.  The  initial 
("boundary  layer"  accounts  for  the  a  priori  knowledge.  This  layer  is 
represented  by: 

(4.6) 

Comparison  of  number  of  data  transmissions  in  Section  4  and  Section  3. 

1.  Step  by  step  Kalman  Filter. 

Each  sampling  instant  requires  +  n  numbers  to  be  sent. 

Thus,  in  total  for  an  interval  of  length  t  -  I 


'Partitioned'  filter. 

(Note  that  we  discussed  the  smoothing  problem  here,  but  the  filtering 
formulas,  and  therefore  the  procedure,  are  identical  in  form). 

t  T" 

There  are  now  — - — -  transmissions  required  of  the  'data' 

-i-  _ 

o'  and  q  2  vectors  of  size  n.  Thus: 
o  o 

it  =  C  -  T  •  2n 
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5 


DISTRIBUTED  DETECTION  ALGORITHMS 


A  common  problem  in  sensor-systems  is  to  decide  whether  a  signal  is 
picked  up,  imbedded  in  noise,  or  if  we  only  "observe"  the  noise.  These 
signal-detection  problems  arise  in  radar,  sonar  and  seismic  sensors. 

There  are  two  hypotheses: 

signal  plus  noise  is  observed; 
noise  only  is  observed. 

More  general  we  shall  consider  the  case  where  there  are  M  hypotheses. 
The  observations  are  then  likewise  used  to  decide  which  of  the  M  models 
corresponds  "best"  to  these  data.  There  are  many  other  applications  for 
hypothesis-testing;  for  instance:  component  failure  detection,  pattern 
recognition,  system  identification  [9],  Details  on  the  theory  can  be 
found  in  many  books,  e.g. ,  [9,],  Cl 1  ] ,  and  [  12 ] .  Here  we  shall  look  at 
some  particular  cases. 


5.1  Static  Systems  Hypothesis  Testing 

Suppose  again  that  we  have  r  disjoint  sensors  N^;  1-1,..., r 
where  observations  are  made.  Let  us  have  the  following  M 

hypotheses  (at  each  station): 


W(i>: 

j 


H(i)x.  +  v(i) 
J  J  J 


j=l, . . . ,M 


(5.1) 


where 


Xj  is  N(0,7I  ) 


(D  ,  K/n  n(i>. 
v  .  is  N(0,R .  ) 

J  3 


(5.2) 

(5.3) 


and  further  x.  and  vf1^  independent,  and  v^-independent  from  all 
J  3  3 

V^i}  if  l  4  i. 


Clearly,  the  hypotheses  are  about  the  same  (global)  parameter  x 
that  may  assume  the  values  x^...x^.  Therefore  if  we  consider  the  combined 
problem  by  stacking  together  the  observations  from  each  node,  then  we 
have  not  Mr  but  still  r  hypotheses. 


H  , :  y  =  H.x.+v. 

j  J  J  J 


j  =  l,...  ,M 


where 


y  . ,  H  .  and 
J  J 


E  v  .v*  = 

J  J 


v.  are  defined  as  in  (2.7)  and 
J 

Block  diagonal  I 


(5.4) 


(5.5) 


It  is  intuitively  clear  that  by  combining  data  from  r  sensors,  better 
decisions  will  be  made  than  if  each  detector  would  "decide"  based  on  its 


hypothesis  is 


own  data.  The  log  likelihood  function  §  for  the  j 

[9]: 


.th 


2|(y)  =  Vy'(HjnjHj  +  Rj)_1y  . 


(5.6) 


where  c  is  a  data-independent  constant  determined  by  such  factors  as 

a  priori  information  on  the  probability  of  the  hypothesis  and  the  penalties 

for  incorrect  decisions.  In  general,  the  decision  is  then  to  take  the 

hypothesis  H.  if  j=k  yields  maximum  £.(z).  For  a  binary  decision 
k  J 

(M=2)  the  log  likelihood  ratio  l  is  more  useful: 


i(y)  =  lx(y)  -  S2(y>  ■  (5.7) 

The  equivalent  decision  rule  is  then  to  compare  2(y)  to  a  threshold  X 
which  depends  again  on  the  a  priori  probabilities  and  the  costs  of  incorrect 
decisions. 


Choose 

W1 

if 

i(y) 

s  X 

Choose 

*2 

if 

JL  (y) 

r< 

V 

• 

The  similarity  in  the  forms  of  equations  (5.6)  and  (2.10)  illustrate  the 
close  relation  between  detection  and  estimation, and  accordingly  distributed 
schemes  exist. 

Equation  (5.6)  can  indeed  be  rewritten  as: 

2 % .  (y)  =  c  .-y,R_1y  -  y'R^'H  .(n"1+H,.R"1H)~1H.RT1y  (5.8) 

J  J  J  3  3  J  J  J  J 
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This  form  makes  clear  how  the  computation  may  be  distributed  using  methods 
very  similar  to  what  we  used  in  the  estimation  problem.  The  local  computa¬ 
tion  at  node  N.  involves: 

1 


L(i>  =  Ha>,(R(i)  -iH  «) 

J  J  J  J 


*<l>  -  y<1>'a.J(1>>-V1> 


j=l , . . . ,M 


(5.9) 


(i)  „(!)/ ,_<i).-l  (i) 

zj  =  Hj  <Rj  >  y 


The  central  station  then  assembles  the  log  likelihood  functions: 


2^(y) 


E 


-  (  E  z4(1))'(  3T-1+  E  L(i)  )-1(  E  Z(i))  (5.10) 


and  compares  them  again?  each  other,  to  decide  which  hypothesis  should  be 
taken. 

Transmission  of  the  pf'*"'  and  zf^;  i=l,.,,,r;  j=l,.,,,M 

JO  J 

involves  maximally  M  +  1  +  nj  parameters  compared  to  the 


P1(P1+1) 


r  r  f  p.(p 

E  p  +  M  E  (  p  n  +  - =■ 

i=l  1  i=l  V  1  ^ 


that  are  required  for  efficient  transmission 


of  the  y^\  and  From  the  point  of  view  of  data  transmission, 

j  J 

one  prefers  distribution  if: 


n  +  3n  +  2  ^  (2n  +  1  +  2/M)  E  p.  +  E 


(5.11) 


5.2  Dynamic  Systems:  Hypothesis  Testing 


Just  as  in  the  static  case,  we  may  expect  a  close  similarity  between 
estimation  and  detection  and  thus  the  aspects  of  distributed  processing 
can  also  be  fruitful  in  this  case. 

Of  particular  interest  are  the  applications  to  system  identification. 
Suppose  we  have  M  hypothesized  linear  system  models: 


x(t+l)  = 


Fj (t)x(t)  +  Uj (t)  ;  j=l,, 


(5.12) 


y (1)  (t )  =  Hj1)  (t)x(t)  +  Vj(t)  ;  i=l,, 


(5.13) 


with:  u.(t)  =  white  noise  with  zero  mean  and  covariance  Q,  and  x(0) 

J  J 

normally  distributed  with  parameters  (0,  HQ).  The  assumptions  on  the 
observations  (5.13)  are  as  usual,  and  we  may  again  combine  them  as  in 
(5.4). 

A  decision  at  time  t  is  essentially  based  (see  [9,  p.  286])  on 
computation  of  the  M  log  likelihood  functions,  and  on  subsequent  compar- 


The  ?^(t)  are  g*ven  by: 


s.(t)  =  -  E  (y(t)-H  (t)x.(t|t-l))' (H,(t)P.(t|t-l)H'(t)+R  (t))‘ 

J  -  iJ  J  J  JJ 

S  =  1  J 


(y(t)-Hj (t)x  (t |t-l)) 


(5.14) 


where  x  (t|t-l)  is  the  predicted  estimate  under  the  hypothesis  & ,  and 

J  ^ 

P  (tjt-l)  is  the  corresponding  error-covariance. 


-  -  *  v  v  •  *  .  *  .  ” 


•  v\  -- 


Various  distributed  algorithms  for  the  estimation  were  already  consid¬ 
ered  in  previous  sections,  and  (5.14)  can  again  be  rewritten  involving  the 

r  r*  T 

sums  Z  z.  ;(t),  Zj  (j.  .  it)  and  Z  L.  ;(t).  We  will  omit  the  details. 

i=l  J  i=l  3  i=l  3 

We  end  this  section  with  the  observation  that  in  the  case  of  non-a  priori 

knowledge  of  H  and  R,  the  detection  problem  requires  the  additional 

scalars  "p"  which  were  not  needed  in  the  corresponding  estimation  problem. 

In  the  case  of  a  priori  knowledge  (hypothesized)  of  and  R^1^, 

J 

the  data  can  be  processed  optimally  in  batches  as  in  the  estimation  problem 
of  Section  4. 
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6. 


CONCLUSIONS 


Some  algorithms  for  the  distributed  estimation  and  detection  of  both 
static  and  dynamic  linear  models  with  Gaussian  modeled  uncertainties  were 
presented.  The  emphasis  was  on  global  estimates  and  decisions  rather  than 
on  the  local  ones.  The  detection  algorithms  were  shown  to  follow  in  a 
straightforward  way  from  their  estimation  counterparts.  Based  on  our 
scattering  approach  we  derived  an  optimal  distributed  block  processing 
algorithm  for  the  case  where  each  sensor  has  a  copy  of  the  system.  These 
results  are  believed  to  be  new.  Further  a  new  step-by-step  procedure  in 
the  case  of  unknown  observation  parameters  was  developed.  It  allows  for 
reconfigurability  and  survivability  as  the  number  of  participating  nodes 
may  vary,  while  still  suboptimal  global  estimates  are  given.  If  the 
global  estimate  and  covariance  are  fed  back  to  the  nodes,  virtually  every 
node  can  perform  the  "centralized"  computations,  thus  providing  failure 
robustness.  Finally,  prprocessing  at  the  sensor  location  may  enclose  some 
valuable  information  about  the  sensors  or  their  location,  a  welcome  feature 
in  some  applications. 
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APPENDIX  A 


A  note  on  the  transmission  of  the  quantity 

We  assume  that  x  and  are  respectively  n-  and  p  - 

dimensional.  Then  has  p^-n  parameters,  l/^  is  a  symmetric 

nxn  matrix,  and  therefore  contains  n independent  parameters. 

For  this  we  could  for  instance  take  the  lower  triangular  part  of 
denoted  as  [L^^]  .  (Thus  an  additive  decomposition  of  L^^), 
or  we  may  consider  a  multiplicative  decomposition  of  into  its 

Cholesky  triangular  factors.  We  shall  consider  in  greater  detail  why 
we  might  do  this. 

1.  Additive  Decompositions 

Obviously  to  get  [L^^].  from  no  additional  computation  at 

station  is  required.  If  at  station  NL  the  information  of  station 

is  to  be  added  to  the  local  information,  then  simply  station 
performs : 


i)  L ^  -*■  [L  ^  ]+  local  information 

ii)  [L^^]+  =  +  [l/^]  +  updating  with  information 

f  rom  N . 

x 

Again  step  ii)  is  a  simple  set  of  n additions  of  scalars,  and 
the  result  is  in  a  direct  transmittable  form.  The  'end  station'  needs 
to  reconstruct  the  L  from  the  [L]  +  :  this  is  a  simple  information  shift. 

A  note  about  secure  transmission:  For  instance,  in  a  tracking  problem 
the  parameters  in  depend  on  the  local  coordinates  of  the  tracking 

station,  and  in  some  applications  we  may  want  to  keep  this  secret. 

But  now: 


A.  1 


v' 


[L(1)] 


I  H(J)  H  j } 
,  mk  m2 
m=l 


k  >  2 


This  yields  -  equations  for  the  p^n  unknowns  ’ 

If  these  nonlinear  equations  can  be  solved,  at  least  in  principle, 

revealing  the  .  Remember,  however,  that  we  opted  for  a  distributed  processing 

if  p  _>  +  1.  To  keep  secrecy,  we  can  inject  redundancy  (i.e.,  aug¬ 

menting  p^).  This  has  no  additional  effects  on  the  transmission  require¬ 
ments,  although  it  increases  the  (local)  computational  burden  to  compute 
L^.  Another  possibility  is  to  scramble  the  transmitted  data  +  , 

but  then  it  has  to  be  descrambled. 

2 •  Multiplicative  Decomposition 

Each  station  does  a  triangular  factorization  of  the  data  L^. 

Let  be  a  lower  triangular  factor  of  then: 

H'H  =  L  =  U  U' 

That  means  that  at  station  we  do  not  have  to  compute  actually, 

but  just  a  triangular ization  of  H'  (by  column  operations) 


ni  H 


If  station  N.  transmits  the  data  Uv  to  N.,  then  N.  can  directlv 

J  i 

compute  an  update  incorporating  his  own  data  .  Indeed, 

performing  column  operations  on  the  array  yields  the  lower  triangular 
factor  U  ^  ^ 


So,  local  information  processing  and  updating  (or  combining)  can  all  be 
done  in  one  step.  Further,  if 

h"h  =  u  u" 


then  U  is  of  the  form 

n 

P 

Let  p  be  the  number  of  nonzero  columns,  then  p  is  the  number  of 
observations  at  the  station,  and  these  p  columns  are  a  scrambled  version 
of  R' .  But  again  since  H^H  is  then  known,  we  can  solve  for  H  if 


Similarly,  if  we  know  the  incoming  and  outgoing  U  to  a  station, 


then: 


rU) 


Ni 


,(j) 


U  (J  >u  (j  )  "  -  u(i)u(i)' 


hO)-h(J) 


,(j) 


and  H  can  be  retrieved  if  the  number  of  nonzero  columns  in  the  tri¬ 
angular  factorization  of  is  less  or  equal  to 

n+1 


Thus  sending  the  parameters  in  'square-root'  form  does  not  provide 
us  with  higher  secrecy  nor  simplicity,  but  might  be  preferred  if  the 
inverse  of  a  triangular  matrix  is  found  by  simple  substitution,  and  the 
inverse  of  the  square  root  of  M  is  the  square  root  of  M  \  whereas 
in  general  [M]  +  *  ^  [M  . 


A.  3 


